The idea for the method of scaling tests which I am about to 
describe occurred to me while trying to tease out the logic of some of 
the well-known educational scales and mental age scales. The authors 
of educational scale monographs seldom give an adequate discussion 
of the assumptions and of the logic of their scale constructions. Of 
course, it is not to be expected that any one method of scaling will 
apply to all types of educational and psychological test material, and 
I am not attempting to offer a single solution for all scaling problems. 

My present method applies especially to test items that can be graded 
right or wrong, and for which separate norms are to be constructed for 
successive age- or grade-groups. By slight modification it is also 
applicable to tests that give a numerical score. The method is, there- 
fore, applicable to such test material as the Binet test questions. It 
is also applicable to educational test material such as arithmetic 
problems or spelling, and to other types of content in which answers can 
be definitely graded as right or wrong and for which we want separate 
tabulations for children of successive ages or of successive grades. 

In order to explain the method I shall follow through an example 
with the Binet test questions. The method is by no means limited 
to mental age data, but the explanation of the method will perhaps be 
clearer if I explain it consistently with one type of material as an 
example. For the actual data I have selected the tables prepared by 
Burt (1) from his Binet test results on 3000 London school children. 

The first part of the analysis will be similar to the methods already 
in general use. In Fig. 1 we have a frequency surface. Let it represent 
the distribution of Binet test intelligence for seven-year-old 
children. It is a normal surface, and this is about the only assumption 
that we shall make. It has been shown by Jaederholm (2) and others 
to be a reasonable assumption. 

Those children who are represented toward the right end of the 
surface in Fig. 1 are bright for their age, while those toward the left 
side are below the average of their age. Evidently the average 
intelligence of seven-year-old children would be represented by the 
point d on the scale since that point is at the mean of the distribution. 
But the brightness of these children is judged by their answers to a 
lot of questions which are more or less roughly graded according to 
difficulty in the mental-age scales. We shall therefore locate these 
test questions on the scale as landmarks of different levels of intellectual 
growth. These questions, when so located on the scale, will serve 
our purpose in arriving at a numerical measurement of scale distances. 

Fig. 1. 

Let us assume that a particular test question has been located at the 
point d on the scale. This means that 50 per cent of the seven-year-old 
children answered it correctly. It is clearly an average question for 
that age since half of the children get it and half of them fail on it. 
Its difficulty represents the mid-point of the distribution of test intel- 
ligence for children of this age. This percentage rating of the test 
question corresponds to the fact that one-half of the surface is to the 
right of the point d. If this is our basis of locating the test questions 
on the scale, it is clear that the difficult questions will be located 
toward the right of the mid-point, and that the easier questions will 
be located toward the left of that point. Consider, for example, the 
question which has been located at the point e. It is one of the more 
difficult questions. It is correctly answered by only 30 per cent of the 
seven-year-old children and it is therefore located at such a point on 
the base line that 30 per cent of the whole surface is to the right of it. 

If we imagine a slightly older group of children whose mental 
growth has advanced a little beyond that of the seven-year-old 
children, we could imagine the average of this older group of children 
to be at the point e. They might have 50 per cent right answers for 
the question which we located at e on the scale for the seven-year-old 
children. It is clear that the distributions of test-intelligence for older 
children will move toward the right, and that the distributions for 
the younger children will, of course, be nearer the left. Now suppose 
that we find a test question which more than half of the seven-year-old 
children can answer correctly. Evidently it should be located at a 
point below the average for the children of this age. Such a question 
is shown at the point c. It is evident that the average seven-year-old 
child is one who is exceeded by 50 per cent of children of his own age 
and who exceeds 50 per cent of children of his own age. In other 
words, an average child is one who is at the middle of the distribution 
of intelligence for children of his age. Now, if a question is answered 
correctly by 60 per cent of seven-year-old children, such a question 
should be located below the average of that age group. 

Let us now consider a test question which is answered correctly 
by 100 per cent of the seven-year-old children. It is so easy that all 
the children in this age group have passed it; the whole distribution 
is to the right of that question. The question, then, should be located 
at a point such as b, or still farther back such as the point a. In fact, 
if a test question is answered correctly by 100 per cent of a certain 
age group, we cannot locate the point definitely on the scale. All 
we know about it is that it is somewhere to the left, beyond the 
distribution. 

The same reasoning applies to a test question which is so difficult 
that none of the seven-year-old children can answer it correctly. 
Such a question marks a point on the scale so far ahead that the distri- 
bution of intelligence of seven-year-old children has not yet covered it. 
Such a point might be at f or still farther on at the point g. When 
we find this type of question, we may expect that an older age group, 
with its distribution of intelligence farther toward the right on the 
scale, will give a higher percentage of right answers and the question 
can then be definitely located on the scale. We see, therefore, that a 
question cannot be definitely assigned to a point on the scale if it is 
either so difficult that none of the children in a given age group can 
answer it or so easy that they can all give the correct answer. The 
percentages of right answers must be above 0 and below 100. 

We can now summarize the procedure so far covered. We assume 
the distribution of intelligence of children of any given age group to 
be approximately normal. Since test-intelligence is indicated by the 
correctness of answers to test questions, it is legitimate to designate 
the points on the scale of test-intelligence by means of the questions as 
landmarks. Each test question is located at a point on the scale so chosen 
that the percentage of the distribution to the right of that point is equal 
to the percentage of right answers to the test question for children of the 
specified age. 

In order to facilitate the handling of such data, we give a numerical 
value to each test question for each age with the standard deviation 
of the distribution as a unit of measurement. All test questions that 
are located above the average of the group have positive values, while 
all test questions that are located below the average of the group have 
negative values. Thus a test question which has a numerical value 
of +1.0σ for seven-year-old children is answered correctly by about 
one-sixth of the children of that age. But its value for the next higher 
age group would, of course, be lower, because more of the eight-year-old 
children would answer it correctly. We shall therefore designate 
numerically each test question for each age group. The numerical 
designation will be different for each age group because the distribution of 
intelligence moves to the right on the scale with increase in age. 
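
The conversion from a percentage of right answers to a sigma value can be sketched in a few lines of Python. This is only an illustration under the normal assumption stated above; the function name `sigma_value` is hypothetical, and the standard-library `NormalDist` stands in for a printed table of areas of the probability surface.

```python
from statistics import NormalDist


def sigma_value(percent_correct: float) -> float:
    """Point on a unit normal curve that leaves `percent_correct`
    per cent of the area to its right."""
    if not 0.0 < percent_correct < 100.0:
        # 0 or 100 per cent cannot be located on the scale.
        raise ValueError("percentage must lie strictly between 0 and 100")
    return NormalDist().inv_cdf(1.0 - percent_correct / 100.0)


# An average question, then the two percentages quoted for Question 40.
for pct in (50.0, 39.1, 68.2):
    print(f"{pct:5.1f} per cent correct -> {sigma_value(pct):+.2f} sigma")

# Share of a normal group passing a question located at +1.0 sigma.
share_above_one_sigma = 1.0 - NormalDist().cdf(1.0)
print(f"share above +1.0 sigma: {share_above_one_sigma:.3f}")
```

The guard on 0 and 100 per cent mirrors the restriction already noted: such questions lie beyond the distribution and receive no definite value.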

Now, what we want is a series of frequency curves, all drawn on 
the same base line, that will truly represent the distributions of 
test-intelligence of children of successive chronological ages. On this 
base line we shall locate each of the test questions. The curves must 
be so drawn that no matter where a test question is located on this 
base line the proportion of any given age distribution to the right 
of that question will correspond to the actually observed percentage 
of children of that age group who answer the question correctly. 
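
The requirement just stated can be written as a small check: given a question's location and an age group's mean and standard deviation on the common base line, the normal curve predicts the percentage passing. The sketch below is illustrative only. The function name is hypothetical, and the numbers are the values quoted later in the article for Question 40 and for the seven- and eight-year-old groups.

```python
from statistics import NormalDist


def predicted_percent(question_location, group_mean, group_sd):
    """Per cent of a normal age group lying to the right of a question."""
    z = (question_location - group_mean) / group_sd
    return 100.0 * (1.0 - NormalDist().cdf(z))


# Scale value of Question 40 and the scale constants of the two groups,
# as reported later in the article.
q40 = 4.3
pct_seven = predicted_percent(q40, group_mean=4.061, group_sd=1.333)
pct_eight = predicted_percent(q40, group_mean=4.875, group_sd=1.496)

print(f"predicted for seven-year-olds: {pct_seven:.1f} per cent")
print(f"predicted for eight-year-olds: {pct_eight:.1f} per cent")
```

The predicted figures can then be set beside the observed 39.1 and 68.2 per cent; agreement is expected to be approximate, since the scale value itself is an average of two determinations.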

In order to draw a series of curves to represent the distributions 
of test-intelligence of children of successive ages we can draw the 
first one anywhere on our scale irrespective of the units or the origin. 
Our problem is then to draw the others on the same scale in such a 
way that their interrelations shall correspond with the actually 
observed percentages of right answers. 

There are two facts that must be established for each curve before 
it can be drawn, namely its mean and its standard deviation. It does 
not matter at all where we locate the origin or zero. Let us arbitrarily 
locate it at the mean of the lowest age group that we are dealing with. 
It can later be shifted to any convenient place. Similarly, it does not 
matter what unit of measurement we use for the base line. Let us 
for convenience adopt as the unit of measurement the standard devia- 
tion of the lowest age group in any particular study. It can also be 
shifted later to any convenient scale. 

In Fig. 2 let the frequency curve A represent the distribution of 
test ability for children of any specified age. Let the curve B represent 
the distribution for an older age group. The base line represents 
achievement, or relative difficulty of test questions, while the ordinates 
represent relative frequencies of children at each degree of achievement. 
The means of the two distributions are designated $M_1$ and $M_2$, 
respectively. Naturally we should expect the higher age group to 
have a higher mean on the scale and, therefore, the two curves are so 
drawn that the mean $M_2$ is higher than the mean $M_1$. 

Let the small circle represent any particular test question. The 
shaded area in the B surface represents the proportion of the older 
age group who can answer the question correctly. The remaining 
unshaded part of the distribution represents the proportion who 
fail on that question. The same reasoning applies to the A distribu- 
tion. There is a larger proportion of the older children who can answer 
the question, and that is reasonable because B represents children 
older than A. 

If we know the percentage of children of different ages who can 
answer each question, it is possible to locate the questions on an 
absolute scale, and it is also possible to locate the means of the succes- 
sive age groups on the same absolute scale, and to determine the 
standard deviations of the successive ages on the same scale. The 
present method assumes that the distribution of abilities is normal, 
but it does not assume that the standard deviations of the successive 
age or grade groups are the same. 

Let $X_1$ represent the deviation from the mean, $M_1$, of a particular 
question for children of a particular age. This is determined in the 
usual way from the percentage of these children who can answer the 
question correctly. Of course, $X_1$ is measured in terms of the standard 
deviation, $\sigma_1$, of the distribution. In the same manner, let $X_2$ represent 
the deviation from the mean, $M_2$, of the same question for an 
older age group. It is, of course, expressed in terms of the standard 
deviation, $\sigma_2$, of its own distribution. 

Fig. 3. 

It is clear in Fig. 2 that $M_1$ plus $X_1\sigma_1$ must be equal to $M_2$ plus 
$X_2\sigma_2$ because they are measurements to the same point on the scale, 
both measurements representing the same test question by two 
different age groups. Hence, 

$$M_1 + X_1\sigma_1 = M_2 + X_2\sigma_2 \tag{1}$$

or, 

$$\frac{M_1}{\sigma_1} + X_1 = \frac{M_2}{\sigma_1} + X_2\,\frac{\sigma_2}{\sigma_1}$$

Transposing, 

$$X_1 = X_2\left(\frac{\sigma_2}{\sigma_1}\right) + \left(\frac{M_2 - M_1}{\sigma_1}\right) \tag{2}$$

This equation is linear if we plot the observed values of $X_1$ against 
the corresponding values of $X_2$ for the same test questions. The 
constants will be as follows: 

$$\frac{\sigma_2}{\sigma_1} = \text{Slope} \tag{3}$$

$$\frac{M_2 - M_1}{\sigma_1} = X_1\text{-intercept} \tag{4}$$

In Fig. 3 the paired values of $X_1$ and $X_2$ have been plotted for two 
adjacent age groups and it is immediately apparent that the relation 
is linear. The two mean lines for $X_1$ and $X_2$ are determined from the 
following relations, 

$$m_1 = \frac{\Sigma X_1}{n}, \qquad m_2 = \frac{\Sigma X_2}{n}$$

in which $m_1$ is the average of the observed X-values for the lower age 
group, and $m_2$ is, similarly, the average of the observed X-values for 
the upper age group, and $n$ is the number of overlapping test questions 
for the two age groups. It should be noted that $m$ in this notation 
represents the average of all the X-values in the data whereas $M$ 
represents the mean of the distribution on the absolute scale. 
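
A quick numerical check of the linear relation may be helpful here. The sketch below uses invented scale constants and question locations (none of them from Burt's data) and confirms that the paired sigma values fall exactly on a line with the slope of equation (3) and the intercept of equation (4).

```python
# Hypothetical scale constants for a lower and an upper age group.
M1, SIGMA1 = 0.0, 1.0
M2, SIGMA2 = 0.9, 1.2

# Hypothetical locations of five test questions on the common base line.
locations = [-1.0, -0.2, 0.5, 1.3, 2.0]

# Sigma values of each question within each group.
x1 = [(q - M1) / SIGMA1 for q in locations]
x2 = [(q - M2) / SIGMA2 for q in locations]

slope = SIGMA2 / SIGMA1            # equation (3)
intercept = (M2 - M1) / SIGMA1     # equation (4)

# Departure of each X1 from the line X1 = slope * X2 + intercept.
residuals = [a - (slope * b + intercept) for a, b in zip(x1, x2)]
largest_residual = max(abs(r) for r in residuals)
print(largest_residual)
```

With error-free percentages the residuals vanish apart from rounding; with observed data they measure the scatter about the line in Fig. 3.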

The next step in the calculation is to determine the numerical 
values of the constants of the line in Fig. 3 because these two constants 
will enable us to draw the two distributions on the same base line. 
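Let the deviations from $m_1$ and $m_2$ be defined as follows: 

$$x_1 = X_1 - m_1$$

$$x_2 = X_2 - m_2$$

Then the line of relation, assuming $r = 1$, is as follows: 

$$x_1 = \frac{s_1}{s_2}\,x_2$$

or 

$$X_1 - m_1 = \frac{s_1}{s_2}\,(X_2 - m_2)$$

in which $s_1$ and $s_2$ are the standard deviations of the observed X-values 
for the lower and upper age groups. Transposing, 

$$X_1 = \frac{s_1}{s_2}\,X_2 + \left(m_1 - \frac{s_1}{s_2}\,m_2\right) \tag{5}$$

But equations (2) and (5) represent the same linear plot. Hence: 

$$\text{The slope} = \frac{s_1}{s_2} = \frac{\sigma_2}{\sigma_1}$$

or 

$$\sigma_2 = \sigma_1\,\frac{s_1}{s_2} \tag{6}$$

and, similarly, 

$$\frac{M_2 - M_1}{\sigma_1} = m_1 - \frac{s_1}{s_2}\,m_2$$

or 

$$M_2 = \sigma_1\left(m_1 - \frac{s_1}{s_2}\,m_2\right) + M_1 \tag{7}$$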


Equations (6) and (7) are the fundamental equations for our 
scaling purposes. By means of equation (6) we can ascertain the 
standard deviation of one age distribution when the other is assumed 
or known, and by means of equation (7) we can locate the mean of 
one of the age distributions when the other is assumed or known. 
These two formulas should prove quite generally useful in scaling 
tests. 

Let us assume an arbitrary origin at the mean of the distribution 
for three-and-one-half-year-old children in Burt’s data, and let us 
assume as our unit of measurement on the absolute scale the standard 
deviation of the same age group. The scaling can later be changed 
into any desired unit of measurement and it can be easily adjusted 
to any desired origin. 

In Table I we have the data from Burt for seven- and eight-year-old 
children. In the first column is the numerical order of the Binet 
questions as arranged by Burt. Note that the list does not extend 
below question 10 nor above question 59. The reason for this is that 
beyond this range one of the age groups received either 0 or 100 per 
cent right answers, which renders scaling for those questions impossible 
for these two age groups. The questions so eliminated can, however, 
be scaled for higher or lower age groups. The second column gives 
a brief description or designation of the Binet questions to facilitate 
identification. The next column shows the actual percentage of 
seven-year-old children who answered each question correctly. The next 
column after that gives the similar facts for the eight-year-old children. 
These data are all copied directly from the tables of Burt for London 
school children. 

The next two columns show the relative difficulty of each question 
for each age group expressed in terms of the standard deviation. 
These figures are obtained directly from the two percentage columns. 
Thus, for example, question 40 was answered correctly by 39.1 per cent 
of the seven-year-old children. Since less than half of the children 
in this age group answered this question correctly, that question 
belongs on the positive side of the difficulty scale. Its sigma standing 
is +.28, which is read directly from a table of areas of the probability 
surface.¹ In other words, at the point +.28σ there remains 39.1 per cent 
of the surface to the right of that point. 

The same question, 40, was answered correctly by 68.2 per cent of 
the eight-year-old children. Since more than half of the children in 
this age group answered this question correctly, the question belongs 
on the negative side of the mean for the eight-year-old children. Its 
exact location is −.47σ, since 68.2 per cent of the whole surface lies 
above that point. It should be noted that the majority of the 
seven-year-old children fail on this question and that the majority of the 
eight-year-old children succeed with it. That is the reason for the 
difference in sign. 

In the last two columns are recorded the squares of the sigma 
values. These are used in calculating the standard deviations 
$s_7$ and $s_8$. 

In Table II are shown the calculations corresponding to the data 
of Table I. Note that the scale values of $M_7$ and of $\sigma_7$ are here known 
because they have been derived from the similar calculations for the 
lower age groups. 

Tables I and II show all the calculations for establishing the values 
of $M_8$ and $\sigma_8$. Similar tables can readily be prepared for any two 
adjacent age groups in order to extend the determinations to any 
desired age range. 
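
The calculation for one pair of adjacent age groups can be sketched as follows. This is a minimal illustration of equations (6) and (7), not a reproduction of the original hand computation. The two lists are the sigma values for questions 10 to 59 as read from Table I, the function names are hypothetical, and the scale constants 4.061 and 1.333 for the seven-year-old group are the values reported in the article.

```python
from math import sqrt

# Sigma values for questions 10-59, as read from Table I.
X7 = [-2.41, -2.33, -2.14, -2.10, -2.65, -1.93, -2.01, -2.01, -1.76, -1.42,
      -1.67, -1.83, -1.71, -1.63, -1.55, -1.55, -1.09, -1.14, -1.55, -1.02,
      -1.05, -0.99, -1.14, -0.86, -0.54, -0.45, -0.21, -0.04, -0.09, 0.35,
      0.28, 0.19, 0.38, 0.43, 0.87, 0.72, 0.81, 1.05, 1.29, 1.63,
      1.53, 1.43, 1.57, 1.87, 1.83, 2.37, 1.68, 1.48, 2.37, 2.10]
X8 = [-2.65, -2.41, -2.41, -2.41, -2.41, -2.33, -2.29, -2.20, -2.29, -1.80,
      -2.20, -2.33, -1.98, -2.33, -1.71, -1.91, -1.42, -1.57, -1.71, -1.52,
      -1.66, -1.29, -1.71, -1.30, -0.84, -1.14, -0.92, -0.72, -0.71, -0.57,
      -0.47, -0.26, -0.29, -0.18, 0.14, 0.24, 0.36, 0.40, 0.56, 0.69,
      1.12, 0.79, 0.89, 0.97, 0.83, 1.02, 1.22, 0.72, 2.01, 1.72]


def mean_and_sd(values):
    """Average m and standard deviation s of the observed X-values."""
    n = len(values)
    m = sum(values) / n
    s = sqrt(sum(v * v for v in values) / n - m * m)
    return m, s


def link_upper_group(x_lower, x_upper, mean_lower, sd_lower):
    """Scale mean and standard deviation of the upper age group."""
    m1, s1 = mean_and_sd(x_lower)
    m2, s2 = mean_and_sd(x_upper)
    sd_upper = sd_lower * s1 / s2                               # equation (6)
    mean_upper = sd_lower * (m1 - (s1 / s2) * m2) + mean_lower  # equation (7)
    return mean_upper, sd_upper


M8, SIGMA8 = link_upper_group(X7, X8, mean_lower=4.061, sd_lower=1.333)
print(f"M8 = {M8:.3f}, sigma8 = {SIGMA8:.3f}")
```

Repeating the call for each adjacent pair, and feeding each result in as the known lower group of the next pair, extends the scale over the whole age range.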

In Table III we have the result of such calculations for the age 
range 3 to 14, inclusive, according to Burt’s data. In the first column 
is recorded the average age of each successive age group. The second 
column shows the scale value of the mean test-intelligence of children 
of each age group, expressed in terms of the standard deviation of 
three-and-one-half-year-old children and with the mean of that age 
group as an origin. In the third column is shown the standard deviation 
of each age group with the same unit of measurement. The 
calculations for each age were carried out as shown in Tables I and II. 

¹ I found Kelley’s tables of areas of the probability surface especially useful 
for these calculations. See also Table I, page 96, of Monroe, “Theory of 
Educational Measurement.” 

The first calculation is made for the lowest age groups, three and 
four, and the formulas are somewhat simplified for that determination 
in view of the fact that $M_1$ is zero and $\sigma_1$ is unity, these two values being 
assumed for the construction of an absolute scale. 

In Fig. 4 the data of Table III have been represented graphically. 
The base line represents chronological age and the ordinates represent 
the absolute scale of Binet test intelligence. The middle curve is 
taken directly from column 2 of Table III and it shows the mean intel- 
ligence of children of successive ages. Note that the points on the 
curves are plotted at the half-year ages to correspond with the classifi- 
cation of the children in the original data of Burt according to their 
last birthday. A striking feature of this curve is that it continues to 
rise even at the age of 14 with no indication of reaching a level. It 
certainly looks as though the kind of intelligence which is measured 
by the Binet tests and their variations continues to grow as rapidly 
at the age of 14 as it does at the age of 9. This conclusion contradicts 
the statements frequently made to the effect that mental test intelli- 
gence approaches an adult and more or less stable level at 14 to 16 or 
18 years. The appearance of these curves indicates that the growth 
of test intelligence continues beyond the age of 14. The continuity 
of the curves is such that one can hardly imagine that they will bump 
into an adult final level of some kind at 16 or 18 and then stay there. 
When we talk about adult intelligence as having been reached some- 
where in the middle of the ’teens, we are not sufficiently cautious in 
recognizing the mechanical limitations of the scale. If the scale stops 
at 16 so that no one can attain a rating higher than that, it is clear 
that the average for the whole population will be a little lower than 
the maximum point on the scale. Those who are below the maximum 
get their true rating, but those who are above the maximum point on 
the scale get only the maximum possible rating. When these ratings 
are pooled into one average, it will, of course, be a little below the 
maximum point of the scale. 
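
The effect of a scale ceiling on the pooled average can be made concrete with a short sketch. It assumes that true ratings are normally distributed and that every rating above the ceiling is recorded as the ceiling itself. The function name and the numbers (a true mean of 17, a standard deviation of 2, a ceiling of 16) are purely hypothetical.

```python
from statistics import NormalDist


def mean_with_ceiling(true_mean, sd, ceiling):
    """Expected recorded rating when ratings above `ceiling` are
    recorded as `ceiling` and true ratings are normal."""
    unit = NormalDist()
    a = (ceiling - true_mean) / sd
    below = true_mean * unit.cdf(a) - sd * unit.pdf(a)   # part below the ceiling
    at_ceiling = ceiling * (1.0 - unit.cdf(a))           # part piled up at the ceiling
    return below + at_ceiling


capped = mean_with_ceiling(true_mean=17.0, sd=2.0, ceiling=16.0)
print(f"recorded average: {capped:.2f}")
```

In this invented case the recorded average falls below the ceiling even though the true average lies above it, which is the mechanical limitation described in the text.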

The Binet scale is extended from one year to the next by inserting 
new tests on which the higher age groups do better than the lower 
age groups. It should be extended beyond the age of 14 or 16 by 
inserting tests on which older subjects succeed better than younger 
ones. It is difficult to find test questions of the ordinary type in which 
such differentiation is possible, but our inability to find them does 
not prove that the development of intelligence stops somewhere in 
the ’teens. Common sense judgment certainly favors the assumption 
that the average man of 40 is more intelligent than the average boy 
of 20, but so far we have not been able to measure that difference. 
Instead of acknowledging this limitation in our measurement methods, 
we have not infrequently attempted to juggle with the definition of 
intelligence to make it fit the measuring devices that are accessible. 

444 The Journal of Educational Psychology 

The validity of the method here described depends largely on the 
linearity of the plot of the X-values for any two adjacent age groups. 
In order to test the consistency of the method throughout the age range 
here considered, correlation tables were plotted for all the adjacent age 
groups with results as shown in Table IV. It will be readily seen that 
all the correlations are +.97 or higher. If the distributions of 
test-intelligence for any of the age groups should not be normal, these 
plots would not be linear, and the correlations would be correspondingly 
reduced. The linearity of all the graphs is very conspicuous 
and similar to that of Fig. 3. 
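
The link between normality and linearity can be illustrated with invented data. In the sketch below all distributions and question locations are hypothetical. The lower group is normal, and the upper group is taken first as normal and then as strongly skewed (an exponential distribution is assumed for the purpose). Percentages passing are computed from the true distributions and then converted to sigma values under the normal assumption, as the method prescribes.

```python
from math import exp, sqrt
from statistics import NormalDist

UNIT = NormalDist()


def sigma_value(proportion_correct):
    """Sigma value assigned to a question under the normal assumption."""
    return UNIT.inv_cdf(1.0 - proportion_correct)


def pearson_r(xs, ys):
    """Product-moment correlation of two equally long lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical question locations on the common base line.
locations = [0.2 * k for k in range(1, 16)]

# Lower group: normal.
lower = NormalDist(mu=1.0, sigma=1.0)
x_lower = [sigma_value(1.0 - lower.cdf(q)) for q in locations]

# Upper group, case 1: also normal.
upper_normal = NormalDist(mu=2.0, sigma=1.3)
x_upper_normal = [sigma_value(1.0 - upper_normal.cdf(q)) for q in locations]

# Upper group, case 2: exponential with mean 2, so the share passing
# a question located at q is exp(-q / 2).
x_upper_skewed = [sigma_value(exp(-q / 2.0)) for q in locations]

r_normal = pearson_r(x_lower, x_upper_normal)
r_skewed = pearson_r(x_lower, x_upper_skewed)
print(f"both normal: r = {r_normal:.4f}; upper group skewed: r = {r_skewed:.4f}")
```

With both groups normal the plot is a straight line and the correlation is unity apart from rounding; with the skewed upper group the plot bends and the correlation drops.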

The upper curve in Fig. 4 shows the growth in Binet test intelligence 
of those children who rank +1σ with reference to children of 
their own age. The lower curve shows similarly the growth curve for 
those children who rank −1σ with reference to children of their own 
chronological age. It is interesting to note that these curves tend to 
separate with advance in age. The interpretation of this fact is that 
the absolute variability of intelligence increases with age. 
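
The three curves of Fig. 4 follow directly from the two data columns of Table III. The sketch below takes the means as read from that table. The standard deviations are entered on the assumption that each has a leading digit of 1 and that the value for the lowest group is unity, as the text states for the construction of the scale. Variable names are illustrative.

```python
# Average ages of the successive groups (last-birthday classification).
ages = [3.5 + k for k in range(12)]

# Scale means from Table III.
means = [0.0, 1.231, 2.336, 3.372, 4.061, 4.875,
         5.433, 6.190, 6.776, 7.314, 8.040, 8.684]

# Scale standard deviations from Table III (leading digit assumed to be 1;
# the first value is unity by construction of the scale).
sds = [1.000, 1.191, 1.255, 1.255, 1.333, 1.496,
       1.500, 1.453, 1.446, 1.503, 1.532, 1.792]

upper = [m + s for m, s in zip(means, sds)]   # children ranking +1 sigma
lower = [m - s for m, s in zip(means, sds)]   # children ranking -1 sigma

for age, lo, mid, hi in zip(ages, lower, means, upper):
    print(f"{age:4.1f}  {lo:6.3f}  {mid:6.3f}  {hi:6.3f}")

spread_first = upper[0] - lower[0]
spread_last = upper[-1] - lower[-1]
```

The widening gap between the first and last rows is the separation of the curves noted in the text.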

Fig. 5. (Base line: chronological age.) 

The next step in our analysis is to ascertain the chronological age 
at which each test question is at par. In other words, what is the 
average age of the children, 50 per cent of whom pass a certain test 
question? In Fig. 5 we have represented the per cent of children of 
successive ages who pass a given test question. Curves are shown for a 
random sample of questions. It was not possible to include curves for 
all the 65 test questions on one chart, because of the lack of space. But 
the curves shown indicate the nature of the function. The numbers on 
the lines indicate the numbers of the Binet test questions. Inspection 
of Fig. 5 shows that the functions are similar for practically all test 
questions with some noticeable variation in the slope of the curve. 

A refined statistical method for ascertaining the age at par for each 
test question would be to fit an equation to the curve for each test 
question and then to ascertain the point at which the curve intersects 
the 50 per cent level. This would be rather laborious and it would 
not appreciably increase the accuracy of the short-cut that is here 
adopted because there is some error in the original percentages 
themselves. 

In order to ascertain the age at which a particular test question 
is at par we draw a curve for that question similar to the curves in 
Fig. 5. We note the percentages immediately below and above 50. 
Call these percentages $p_1$ and $p_2$ respectively. Then the interpolation 
is represented by the following formula: 

$$\text{Age at par} = Y + \frac{50 - p_1}{p_2 - p_1}$$

in which $Y$ = age which has $p_1$ correct answers, and $p_1$ and $p_2$ are 
the two percentages immediately below and above 50. 

Applying this method to Question 8 in Burt’s tables as an example, 
we have 

$$\text{Age at par} = 3.5 + \frac{50 - 45}{83 - 45} = 3.6$$

In Table V we have the test questions as listed by Burt together 
with the age allocation of each question by the method here described. 
The age allocation is so determined that a large random sample of 
children of the specified age at par would give approximately 50 per 
cent right answers to the particular question. Older children would 
give correct answers more often, and younger children would give 
correct answers less often. It was not possible to give a definite age 
allocation to some of the questions in the Binet tests at the lower end 
and at the upper end. The first few questions were so easy that the 
lowest age group in Burt’s study, the three-year-old children, gave 
more than 50 per cent right answers and the last few questions were 
so difficult that less than 50 per cent of the 14-year-old children gave 
right answers. Of course one could resort to make-shift in scaling 
these test questions but it must be recalled that the functions shown 
in Fig. 5 are not all parallel. Hence the advisability of finding 
experimentally those age groups which get close to 50 per cent of 
the questions right. 
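
The interpolation for the age at par can be sketched as a small function. The name `age_at_par` and the `age_step` parameter are assumptions introduced for illustration; the step of one year reflects the spacing of the age groups in Burt's data.

```python
def age_at_par(lower_age, pct_lower, pct_upper, age_step=1.0):
    """Linear interpolation of the age at which 50 per cent pass.

    lower_age -- average age of the group giving `pct_lower` per cent right
    pct_lower -- percentage immediately below 50
    pct_upper -- percentage immediately above 50 (next older group)
    """
    if not pct_lower < 50.0 < pct_upper:
        raise ValueError("the two percentages must bracket 50")
    return lower_age + age_step * (50.0 - pct_lower) / (pct_upper - pct_lower)


# Question 8: 45 per cent at age 3.5 and 83 per cent in the next group.
q8 = age_at_par(3.5, 45.0, 83.0)
print(f"age at par for Question 8: {q8:.1f}")
```

The bracketing check corresponds to the remark that questions passed by more than 50 per cent of the youngest group, or by fewer than 50 per cent of the oldest, cannot be given a definite age allocation.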
TABLE I¹

P7, P8: percentage of seven-year-olds and of eight-year-olds who answer correctly. 
X7, X8: difficulty for seven-year-olds and for eight-year-olds. 

No.  Description of Binet questions     P7     P8      X7      X8     X7²    X8²
10   [illegible]                       99.2   99.6   −2.41   −2.65   5.81   7.02
11   [illegible]                       99.0   99.2   −2.33   −2.41   5.43   5.81
12   [illegible]                       98.4   99.2   −2.14   −2.41   4.58   5.81
13   [illegible]                       98.2   99.2   −2.10   −2.41   4.41   5.81
14   10 syllables                      99.6   99.2   −2.65   −2.41   7.02   5.81
15   [illegible]                       97.3   99.0   −1.93   −2.33   3.72   5.43
16   Morning and afternoon             97.8   98.9   −2.01   −2.29   4.04   5.24
17   [illegible]                       97.8   98.6   −2.01   −2.20   4.04   4.84
18   [illegible]                       96.1   98.9   −1.76   −2.29   3.10   5.24
19   [illegible]                       92.2   96.4   −1.42   −1.80   2.02   3.24
20   [illegible]                       95.3   98.6   −1.67   −2.20   2.79   4.84
21   [illegible]                       96.6   99.0   −1.83   −2.33   3.35   5.43
22   [illegible]                       95.6   97.6   −1.71   −1.98   2.92   3.92
23   Transcription                     94.8   99.0   −1.63   −2.33   2.66   5.43
24   [illegible]                       93.9   95.6   −1.55   −1.71   2.40   2.92
25   [illegible]                       94.0   97.2   −1.55   −1.91   2.40   3.65
26   Divided card                      86.2   92.2   −1.09   −1.42   1.19   2.02
27   Definition (use)                  87.2   94.2   −1.14   −1.57   1.30   2.46
28   [illegible]                       94.0   95.6   −1.55   −1.71   2.40   2.92
29   Picture (description)             84.7   93.6   −1.02   −1.52   1.04   2.31
30   [illegible]                       85.3   95.2   −1.05   −1.66   1.10   2.76
31   Right and left                    83.8   90.1   −0.99   −1.29   0.98   1.66
32   Missing features                  87.3   95.6   −1.14   −1.71   1.30   2.92
33   Pence and halfpence               80.5   90.4   −0.86   −1.30   0.74   1.69
34   Differences (concrete)            70.6   80.0   −0.54   −0.84   0.29   0.71
35   [illegible]                       67.3   87.2   −0.45   −1.14   0.20   1.30
36   Reading (2 facts)                 58.2   82.1   −0.21   −0.92   0.04   0.85
37   Easy questions                    51.7   76.5   −0.04   −0.72   0.00   0.52
38   Counting 20 to 1                  53.6   76.0   −0.09   −0.71   0.01   0.50
39   [illegible]                       36.4   71.4   +0.35   −0.57   0.12   0.32
40   [illegible]                       39.1   68.2   +0.28   −0.47   0.08   0.22
41   [illegible]                       42.5   60.2   +0.19   −0.26   0.04   0.07
42   [illegible]                       35.3   61.6   +0.38   −0.29   0.14   0.08
43   [illegible]                       33.3   57.2   +0.43   −0.18   0.18   0.03
44   Reading (6 facts)                 19.3   44.5   +0.87   +0.14   0.76   0.02
45   Definition (class)                23.6   40.4   +0.72   +0.24   0.52   0.06
46   [illegible]                       21.0   36.1   +0.81   +0.36   0.66   0.13
47   Sentence building [illegible]     14.7   34.4   +1.05   +0.40   1.10   0.16
48   Memory drawing                     9.8   28.9   +1.29   +0.56   1.66   0.31
49   Absurdities                        5.2   24.4   +1.63   +0.69   2.66   0.48
50   Difficult questions                6.3   13.1   +1.53   +1.12   2.34   1.25
51   [illegible]                        7.6   21.6   +1.43   +0.79   2.04   0.62
52   [illegible]                        5.8   18.8   +1.57   +0.89   2.46   0.79
53   Sentence building [illegible]      3.1   16.5   +1.87   +0.97   3.50   0.94
54   [illegible]                        3.4   20.2   +1.83   +0.83   3.35   0.69
55   Mixed sentences                    0.9   15.3   +2.37   +1.02   5.62   1.04
56   Picture (interpretation)           4.6   11.2   +1.68   +1.22   2.82   1.49
57   [illegible]                        6.9   23.6   +1.48   +0.72   2.19   0.52
58   [illegible]                        0.9    2.2   +2.37   +2.01   5.62   4.04
59   26 syllables                       1.8    4.3   +2.10   +1.72   4.41   2.96























¹ The children were classified according to their last birthday. The ages are handled as integers 
for convenience in the calculations. 

(1) Burt, Cyril: “Mental and Scholastic Tests.” P. S. King, London, p. 132. 
(2) Jaederholm, G. A.: On the Measurement of Intelligence. Scandinavian Scientific Review, 
April, 1923. 
TABLE II

(1) $\Sigma X_{7+} = +26.23$    $\Sigma X_{8+} = +13.68$    $\Sigma X_7^2 = 113.55$
    $\Sigma X_{7-} = -40.87$    $\Sigma X_{8-} = -53.95$    $\Sigma X_8^2 = 119.32$
    $\Sigma X_7 = -14.64$    $\Sigma X_8 = -40.27$    $n = 50$

(2) $m_7 = \frac{\Sigma X_7}{n} = \frac{-14.64}{50} = -.293$    $m_7^2 = .0858$
    $m_8 = \frac{\Sigma X_8}{n} = \frac{-40.27}{50} = -.805$    $m_8^2 = .6480$

(3) $s_7 = \sqrt{\frac{\Sigma X_7^2}{n} - m_7^2} = \sqrt{\frac{113.55}{50} - .0858} = 1.478$
    $s_8 = \sqrt{\frac{\Sigma X_8^2}{n} - m_8^2} = \sqrt{\frac{119.32}{50} - .6480} = 1.318$

(4) $M_7 = 4.06$ (derived from similar calculations for the 6-7 age groups)
    $\sigma_7 = 1.33$ (derived from similar calculations for the 6-7 age groups)
    $M_8 = \sigma_7\left(m_7 - \frac{s_7}{s_8}\,m_8\right) + M_7$
    $M_8 = 1.33\left(-.293 - \frac{1.478}{1.318}(-.805)\right) + 4.06$
    $M_8 = 4.87$

(5) $\sigma_8 = \sigma_7\,\frac{s_7}{s_8} = \frac{1.33 \times 1.478}{1.318}$
    $\sigma_8 = 1.496$





It will be seen from Table V that the test questions are more numerous 
at certain ages than at others. For example, there are 12 questions 
that scale at par between the ages five and six, whereas there are only 
four questions that scale at par between six and seven. 

If we desire to ascertain the scale value of each test question it 
can be done best by noting the two percentages just above and just 
below 50 in curves like those of Fig. 5. Let these two percentages be 
$p_1$ and $p_2$ as before, and let $X_1$ and $X_2$ be the corresponding sigma 
values. Then we have for the scale value the following determination. 

$$\text{Scale value of test question} = \tfrac{1}{2}\left[(M_1 + X_1\sigma_1) + (M_2 + X_2\sigma_2)\right]$$

Applying this procedure to Question 40 as an example, we have the 
following numerical values: 

$p_7 = .391$    $p_8 = .682$
$X_7 = +.277$    $X_8 = -.473$
$M_7 = 4.061$    $M_8 = 4.875$
$\sigma_7 = 1.333$    $\sigma_8 = 1.496$

These values give a scale value of 4.3 for Question 40. The scale 
value of particular test questions may also be determined by interpolation 
between the scale values of the ages corresponding to $p_1$ and $p_2$. 
These two methods give practically the same numerical results. This 
latter method was used for the determination of the scale value of each 
test question. The result is shown in Fig. 6 which also brings out 
rather strikingly the fact that the questions are unduly bunched at 
certain ranges and rather scarce at other ranges. 
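
The first of the two determinations, the average of the locations implied by the two bracketing age groups, can be checked with a few lines. The function name is hypothetical; the inputs are the values quoted above for Question 40.

```python
def scale_value(m_lower, x_lower, sd_lower, m_upper, x_upper, sd_upper):
    """Average of the two locations of one question, one from each of
    the two age groups whose percentages bracket 50."""
    from_lower = m_lower + x_lower * sd_lower
    from_upper = m_upper + x_upper * sd_upper
    return 0.5 * (from_lower + from_upper)


# Question 40 with the seven- and eight-year-old groups.
q40 = scale_value(4.061, +0.277, 1.333, 4.875, -0.473, 1.496)
print(f"scale value of Question 40: {q40:.1f}")
```

The two separate locations differ somewhat from each other, which is why they are averaged rather than either one being taken alone.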


SUMMARY 


An absolute method of scaling tests is here proposed which assumes 
that the distributions of ability in the several age or grade groups 
are normal but which allows freedom of variation for the means and 
for the standard deviations of the several age groups. The method is 
illustrated chiefly by Figs. 2 and 3 and Tables I and II. The type of 
result that is obtained by the method is illustrated by Fig. 4 and 
Table III. 

The particular results of the method as applied to Burt’s data for 
the Binet tests are of secondary value in this article but of some 
interest. Figure 4 shows that test-intelligence grows nearly as rapidly 
at the age of 14 as it does at the age of 9. This finding is not consistent 
with some current notions about so-called “adult” intelligence at
the age of 14. It may be that this curve, if continued, would drop its 
acceleration to reach a limit in the early 20s or perhaps even at the 
age of 20 but it can hardly be extended to reach a limit much sooner 
than that. 

It is also found that the absolute variability of test-intelligence 
increases noticeably with age. For example, the variability of test- 
intelligence for children at 14 is nearly twice that of children at three. 

The application of the present method of scaling to Binet test data
shows that the distributions of intelligence for children can be assumed 
to be normal at least as far up as the age of 14. 
TABLE III

  AGES        M         σ
   3½       0
   4½       1.231     1.191
   5½       2.336     1.255
   6½       3.372     1.255
   7½       4.061     1.333
   8½       4.875     1.496
   9½       5.433     1.500
  10½       6.190     1.453
  11½       6.776     1.446
  12½       7.314     1.503
  13½       8.040     1.532
  14½       8.684     1.792


TABLE IV.—CORRELATION BETWEEN SIGMA VALUES OF QUESTIONS FOR ADJACENT AGES

                 NUMBER OF
  AGES           OVERLAPPING QUESTIONS       r
   3½- 4½              26                  +.97
   4½- 5½              33                  +.97
   5½- 6½              41                  +.98
   6½- 7½              48                  +.97
   7½- 8½              50                  +.99
   8½- 9½              44                  +.99
   9½-10½              31                  +.97
  10½-11½              30                  +.99
  11½-12½              26                  +.99
  12½-13½              25                  +.98
  13½-14½              23                  +.98
TABLE V

  Binet test                      Binet test
  questions       Year at par     questions       Year at par
  (Burt’s data)                   (Burt’s data)

       8             3.6              36              7.3
       9             3.8              37              7.4
      10             3.7              38              7.4
      11             4.0              39              7.9
      12             4.1              40              7.9
      13             4.1              41              7.9
      14             4.5              42              8.1
      15             4.4              43              8.2
      16             4.5              44              8.7
      17             4.5              45              8.9
      18             4.7              46              9.3
      19             4.9              47              9.7
      20             5.1              48              9.8
      21             5.4              49             10.5
      22             5.5              50             10.6
      23             5.6              51             10.9
      24             5.7              52             10.9
      25             5.5              53             11.0
      26             5.6              54             11.4
      27             5.7              55             11.2
      28             5.8              56             11.6
      29             5.5              57             12.0
      30             5.6              58             12.9
      31             5.5              59             13.4
      32             6.0              60             13.3
      33             6.7
      35             6.7











A PRELIMINARY STUDY OF A TEST FOR SOCIAL 
PERCEPTION 


GEORGINA STICKLAND GATES 
Barnard College, Columbia University 


In a recent article,¹ the writer published the data obtained when
458 school children, ranging in age from 3 to 14, and 36 adult women
were asked to interpret 6 pictures from a set published by Ruckmick²
representing laughter, pain, anger, fear, scorn and surprise. The
children were tested individually,³ and were asked in the case of each
picture “What is this lady doing?” “What is this lady thinking
about?” “How does she feel?” Tabulation of the replies, making
use of a somewhat liberal scoring method, showed a gradual increase
in ability to interpret each of the pictures as the children advanced in
age, or in school grade, and demonstrated that the laughter picture was
understood by more than half of the children whose age at last birthday
was 3, pain at age 6, anger at 7, fear at 10, surprise at 11, while
scorn was described correctly by only 43 per cent of the 11-year-old
children.

In the present article an attempt is made (1) to demonstrate the 
changes which occur when we score the test as a unit, i.e., tabulate
the percentage of children of various ages and grades who were able to 
interpret adequately none, one, two, three, four, five or six pictures, 
and (2) to discover the relationship which exists between the present 
test and other measures; chronological and mental ages, physical 
measurements, social, emotional, and intellectual maturity. 

Tables I and II present the data on the first problem. Any sub- 
ject’s score may be compared with those of his age and grade by the 
use of these tables. It is evident that the ability represented by this 
test increases gradually with age and grade, though it is immediately 
apparent that this increase is not uniform throughout the series. 
Uniformity can, perhaps, be best achieved by averaging each two adjacent
age and grade means, obtaining then for ages 3-4, 1.65; ages 5-6,
1.95; ages 7-8, 2.50; ages 9-10, 3.80; ages 11-12, 4.5; and for the
kindergarten and Grade I, 1.80; Grades II and III, 2.40; Grades IV and
V, 3.85; Grade VI, 4.70. The extreme age groups (see Table I) are





1 Gates, G. S.: An Experimental Study of the Growth of Social Perception. 
Journal of Educational Psychology, November, 1923. 


2 Ruckmick, C. A.: Psychological Review Monograph, No. 136, 1921.
3 The writer is indebted to Miss Margaret Miller for administering these tests.
obviously not adequately represented both because the number of
cases in these groups is small and because of the method of selection 
employed. The three-year-old children in the kindergarten are 
probably brighter than the average children of that age and the 
thirteen-year-old children tested are doubtless duller. The testing of 
more representative children in the extreme age groups is needed. 
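
The smoothing described above can be checked with a short sketch. It takes the average scores printed in Tables I and II, pairs adjacent groups, and averages each pair without weighting by the number of cases, which is what the quoted figures imply. The helper name `pairwise_means` is illustrative and not part of the article; age 13 is left out because the text does not pair it.

```python
def pairwise_means(groups):
    """Average each two adjacent (label, mean) entries, unweighted.

    A trailing unpaired group (here Grade VI) is carried through alone.
    """
    out = []
    for i in range(0, len(groups), 2):
        chunk = groups[i:i + 2]
        label = "-".join(str(name) for name, _ in chunk)
        mean = sum(value for _, value in chunk) / len(chunk)
        out.append((label, round(mean, 2)))
    return out


# Average scores as printed in Table I (ages) and Table II (grades).
age_means = [(3, 1.7), (4, 1.6), (5, 1.7), (6, 2.2), (7, 2.3),
             (8, 2.7), (9, 3.0), (10, 4.6), (11, 4.6), (12, 4.4)]
grade_means = [("K", 1.5), ("I", 2.1), ("II", 2.1), ("III", 2.7),
               ("IV", 3.4), ("V", 4.3), ("VI", 4.7)]

smoothed_ages = pairwise_means(age_means)
smoothed_grades = pairwise_means(grade_means)
for label, mean in smoothed_ages + smoothed_grades:
    print(label, mean)
```

Because the pairs are averaged without regard to group size, a small group such as the ten three-year-olds counts as much as the forty four-year-olds; a weighted mean would give slightly different figures.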


TABLE I.—PERCENTAGE OF CHILDREN OF VARIOUS AGES GIVING FROM 0 TO 6
CORRECT INTERPRETATIONS

          Number of correct interpretations      Average    Number
  Ages     0    1    2    3    4    5    6        score     of cases

    3     30   30   10   20   10   00   00         1.7         10
    4     20   25   35   17   03   00   00         1.6         40
    5     14   29   31   25   01   00   00         1.7         85
    6     02   31   31   24   10   02   00         2.2         58
    7     04   24   32   24   12   04   00         2.3         50
    8     04   15   19   42   10   10   00         2.7         52
    9     03   03   18   48   23   05   00         3.0         39
   10     00   00   00   21   29   21   29         4.6         28
   11     00   02   05   10   25   35   23         4.6         44
   12     00   00   04   19   25   33   19         4.4         27
   13     00   06   12   23   23   23   12         3.8         17
TABLE II.—PERCENTAGE OF CHILDREN OF VARIOUS GRADES GIVING FROM 0 TO 6
CORRECT INTERPRETATIONS

                               Number of correct interpretations    Average   Number
                                0    1    2    3    4    5    6      score    of cases

  Kindergarten (Group II)      22   27   27   22   02   00   00       1.5        67
  Grade I (Group II)           00   28   42   23   05   02   00       2.1        43
  Grade II (Group II)          03   32   28   26   09   02   00       2.1        58
  Grade III (Group II)         05   11   27   34   15   08   00       2.7        60
  Grade IV (Group II)          00   09   05   62   26   07   00       3.4        42
  Grade V (Group III)          00   00   00   17   35   26   22       4.3        23
  Grade VI (Group III)         01   01   04   05   28   37   25       4.7        81

Since the average frequently falls within the same step on the scale for
different age and grade groups, the addition of more pictures is
suggested.


TABLE III.—PARTIAL CORRELATION OF TEST OF SOCIAL PERCEPTION WITH OTHER
MENTAL AND PHYSICAL MEASUREMENTS, WITH CHRONOLOGICAL AGE ELIMINATED
(46 CASES)

                               Test of
                               social       Ossified
                               perception   area       Height    MA     Composite

  Test of social perception      ...         .09        .30      .10      .30
  Ossified area                  .09         ...        .44      .12      .33
  Height                         .30         .44        ...      .40      .42
  MA                             .10         .12        .40      ...      .43
  Composite                      .30         .33        .42      .43      ...




















TABLE IV.—CORRELATION OF TEST OF SOCIAL PERCEPTION WITH MENTAL,
PHYSICAL AND SOCIAL TRAITS (24 CASES)

                               Social   Phys.   Ment.   Soc.    Schol.  Emot.
                               perc.    mat.    mat.    mat.    mat.    mat.     CA     MA

  Test of social perception     ...     .26     .22     .42     .60     .50     .31    .12
  Physical maturity             .26     ...     .49     .61     .36     .51     .26    .29
  Mental maturity               .22     .49     ...     .91     .61     .76     .07    .54
  Social maturity               .42     .61     .91     ...     .59     .80     .08    .37
  Scholastic maturity           .60     .36     .61     .59     ...     .78     .07    .34
  Emotional maturity            .50     .51     .76     .80     .78     ...     [?]    .52
  CA                            .31     .26     .07     .08     .07     [?]     ...    .02
  MA                            .12     .29     .54     .37     .34     .52     .02    ...

  [?] Entry illegible in the source.





























In Tables III and IV the correlations of this test with certain other 
measures are given. In each case the children of the kindergarten 
of a private school were used as subjects; in Table III, 46 subjects are 
included; for Table IV only 24 cases were obtained. 
The measures used in Table III are:

1. (CA) Chronological Age in years and months. 

2. (MA) Mental Age as determined by the Stanford Revision of 
the Binet-Simon Scale. 

3. Height. 

4. Ossified Area. The area of the wrist bones which shows, under 
the x-ray, ossification. Photos of both wrists were taken. The 
measure is the sum of the two. This measure has been proposed as an 
index of general anatomical and physiological maturity.¹

5. Composite Score, consisting of teachers’ estimates of scholastic 
and mental maturity combined with the Stanford Binet Mental 
Age; each trait given equal weight. 

Since the kindergarten children tested varied in age from three 
years and one month to six years and one month, and since these age 
differences are obviously factors in increasing the size of correlations, the 
influence of chronological age (which has a correlation of .20 with the 
present test in this group) has been eliminated by the technique of 
partial correlations. The test appears to have a negligible relation 
to mental age (.10) or to the area of the ossification of the wrist bones 
(.09) and a small positive relation to height (.30) and to the score made 
by combining teachers’ judgments of maturity with mental age (.30). 
The correlation with height appears to the author to be, perhaps, 
merely additional evidence of the danger of inferring too much from the 
relative size of small coefficients, since it seems probable that all these 
correlations, were they accurately determined, would average about 
.20, the usual correlation found between unrelated desirable traits.
The relation with the composite is explicable either in the same manner 
or in the light of the results of Table IV, for we may assume that the 
teachers’ estimate made social as well as strictly intellectual maturity 
a factor in determining a score in this particular ranking. 

In Table IV, an attempt is made to eliminate the difficulties due to 
chronological age by the selection of subjects rather than by statistical 
methods, by omitting the very young and relatively older children and 
restricting the age range to about one year and a half—from age four 
years, four months to age five years, ten months. The original 
correlations are given since eliminating the effect of chronological age 
reduces only one correlation (that between the test of Social Perception 
and Physical Maturity from .26 to .22) more than two points. 





1 See Woodrow, H.: “Brightness and Dullness in Children.”
Included here were other measures more closely related to the function
which it was our purpose to test by means of the pictures. Beside
Chronological and Mental Ages, in Table IV there occur scores which 
represent the average judgment of five teachers acquainted with the 
children, data collected in connection with a separate problem by 
different experimenters. The teachers were asked to score the pupils 
on a scale from 1 to 10 (10 being the highest), rating separately Physi- 
cal, Mental, Scholastic, Social, and Emotional Maturity, and keeping 
in mind in making each judgment, certain criteria which serve to 
differentiate these traits. 

The definitions given to the teachers may be summarized in the 
following statement. Physical Maturity was listed as including good 
health throughout the school year, a large amount of physical energy 
as shown by “pep” in work or play and enjoyment of exercise, and by 
ability to work or play without strain. Mental Maturity was judged
by considering the pupil’s ease of understanding of class discussions 
and directions, his asking of intelligent questions, and by the apparent 
developing of intellectual interests and a sense of responsibility. 
Evidence of Social Maturity was found in the child’s ability to cooper- 
ate, to get along with his fellows and superiors, to hold his own without 
being too aggressive, to participate in class activities, to respect disci- 
plinary regulations, the property of others, personal hygiene, and to 
exercise a wholesome influence in the classroom through his leadership. 
Scholastic Maturity was shown mainly by eagerness for work, desire 
to learn, the volunteering of information, the exercise of initiative, 
and the ability to stick to a problem or project until it was completed. 
As criteria for judging Emotional Maturity were mentioned the ability 
to keep from crying when hurt or disappointed, the disappearance of 
babyish reactions in speech or interests, the evenness and normality 
of emotional responses (i.e., the child is not overstimulated by a new
situation nor does he fail to give the expected emotional reaction). 

Examination of the criteria shows a certain ambiguity and over- 
lapping of the abilities which were to be measured and suggests that 
the name given to one, at least, of the kinds of maturity studied, may 
be misleading. Perhaps by Scholastic Maturity we mean eagerness, 
sustained interest, constancy rather than what the name usually 
implies. The high correlations between the judgments are evidence 
not only of the positive relation found between desirable traits, and 
of the “halo effect,” but of a certain looseness of definition. It seems
that we might group these estimates in three classes, two of which are 
merely judgments of Physical and Mental Maturity as these concepts
are frequently defined; the third includes capacities which are neither 
strictly physical nor strictly mental. This third group includes 
Social Maturity which is an ability to cooperate with, to get along with 
one’s fellows and superiors, to show the most approved forms of gre- 
gariousness, self-assertion, and submission; Scholastic Maturity which 
is mainly a temperamental capacity, an eagerness to learn and an 
ability to stick to a problem or project, a blend of curiosity and aggres- 
sive reactions to things, and Emotional Maturity which is the ability 
to control one’s emotions. 

It is immediately apparent that the test of social perception as 
given, correlates less well with chronological or mental age (the 
coefficients being .31 and .12) or with the judged physical or mental 
maturity of the subjects (the relation here is represented by the figures 
.26 and .22) than it does with the estimates of the more vaguely defined 
traits of Class III, ability to cooperate (coefficient .42), to be interested 
in and stick to a project (.60) and to exercise control of one’s emotional 
reactions (.50). The first and the last of the third group are obviously 
social traits, the second might possibly be so considered when we take 
into account the environment of the kindergarten child whose eagerness 
for and constancy in work is evidenced not by the pursuit of researches 
unaided as is frequently the case with the adult worker, but by ability 
to cooperate in projects which are almost always dependent upon the 
approval, more or less subtly expressed, of others. 

It might be urged (if correlations based on so few cases may be 
utilized in support of any contention) that ability to interpret the 
facial expressions of others (as measured by this test) is, with these 
young children, a factor in social cooperation, in demonstrated eager- 
ness for, and continued interest in projects, and in desirable emotional 
control. Perhaps a better statement of the findings would be the 
more conservative conclusion that some evidence has been presented 
to show that the present test is not merely another measure of chrono- 
logical or mental age, but that it gauges, in some measure, traits more 
definitely social in character. 
CORRELATIONS IN PHYSICAL AND MENTAL
GROWTH 


ETHEL M. ABERNETHY 
Queens College 


Part I 


During recent years there has been much interest in the question 
of a probable relationship between physical and mental development. 
The problem is of practical as well as of theoretical significance. In 
its application it is related to the everyday administration questions 
of classification and promotion of pupils in the schools. Those who 
hold that there is marked positive correlation between mental growth 
and certain indices of physical maturity have proposed that grouping 
for purposes of instruction should be based, in part at least, upon 
physiological or anatomical age. Woodrow¹ represents those who have
urged the theory that knowledge of anatomical age is useful in the 
diagnosis of a child’s mental ability and in planning and regulating 
his education. 

Aside from the difficulty of measuring mental development, the 
problem of determining the relationship between physical and mental 
growth stages is complicated by lack of agreement regarding the selection
of criteria of physiological and anatomical maturity. Bean²
and Beik³ have attached much importance to the eruption of the permanent
teeth as an index of growth, both physical and mental.
Other investigators, notably Woodrow and Lowell,⁴ Baldwin and
Stecher,⁵ Freeman and Carter,⁶ Dearborn and Prescott,⁷ have accepted





1 Woodrow, Herbert: “Brightness and Dullness in Children,” 1919, Chapter
VI.

2 Bean, R. B.: The Eruption of Teeth as a Physiological Standard for Testing
Development. Pedagogical Seminary, XXI, 1914, pp. 596-614.

3 Beik, A. K.: Physiological Age and School Entrance. Pedagogical Seminary,
XX, 1913, pp. 277-331.

4 Woodrow, H. and Lowell, F.: Some Data on Anatomic Age and Its Relation
to Intelligence. Pedagogical Seminary, XXIX, March, 1921.

5 Baldwin, B. T. and Stecher, L. I.: “Mental Growth Curves of Normal and
Superior Children.” University of Iowa, 1922.

6 Freeman, F. N. and Carter, T. M.: A New Measure of the Development of
the Carpal Bones and Its Relation to Physical and Mental Development. Journal
of Educational Psychology, May, 1924.

7 Prescott, D. A.: “The Determination of Anatomical Age in School Children
and Its Relation to Mental Development.” Harvard University, 1923.
skeletal development as the best index of anatomical age and the
ossification process in the wrist as the most reliable measure of skeletal
growth. As yet, however, there is lack of uniformity of method in
the measurement of carpal development. Crampton,¹ Godin² and
Baldwin³ are among those who have emphasized the importance of
pubescence as a criterion of physiological and mental maturity.
Height and weight are evidently unsatisfactory measures of develop- 
ment because of the difficulty in predicting ultimate size. More 
extensive studies by the method of intercorrelation seem to be needed 
to reveal the relationship between the various indices of physical 
growth and to determine the validity of the commonly accepted 
criteria of physiological and anatomical maturity. 


THE RESULTS OF PREVIOUS INVESTIGATIONS OF THE PROBLEM


Recent investigations of the relationship between carpal development
and mental age have agreed in finding very low correlations when
chronological age is a constant factor. The size of the coefficient
of correlation seems not to be greatly affected by the method of measuring
carpal development, whether the method be subjective or mathematically
exact and quantitative; whether it be measurement of total
carpal area alone, or of the ratio of the ossification to the carpal
quadrilateral. Woodrow and Lowell⁴ based their judgment of carpal
development mainly upon general impression and found a positive but
low correlation with intelligence, the coefficients varying from .12
to .41. Baldwin⁵ used an objective method of measuring the total
carpal development. His partial correlations gave no indication of
relationship between ossification and mental age. Freeman and
Carter⁶ of the University of Chicago, and Dearborn and Prescott⁷
of Harvard University, have adopted a more discriminating measure
of carpal growth through the use of a ratio which takes into consideration
the size of the hand. Carter’s correlation between the mental





1 Crampton, C. W.: “Physiological Age—A Fundamental Principle.” American
Physical Education Review, XIII, 1908, Nos. 3-6.
2 Godin, Paul: “Growth during School Age.” (Translation) Boston, 1920.
3 Baldwin, Bird T.: “The Physical Growth of Children from Birth to Maturity.”
University of Iowa, 1921.
4 Woodrow, H. and Lowell, F.: Op. cit.
5 Baldwin, B. T. and Stecher, L. I.: Op. cit., pp. 56-57.
6 Freeman, F. N. and Carter, T. M.: Op. cit.
7 Prescott, D. A.: Op. cit.
age and this ossification ratio, when chronological age was held
constant, gave practically zero coefficients (+.084 for boys, +.088 for
girls). Prescott also has reported very low partial correlations. For four
groups studied the coefficients are, respectively, -.05, +.12, +.30,
+.33.

Correlations between mental age and the other indices of physical 
development have also failed to show any decided relationship between 
mental and physical growth. In the investigation of Woodrow and 
Lowell, the number of permanent teeth showed slightly less correlation 
with intelligence than did the degree of carpal development. The 
results of their study indicated a negligible relationship between height 
and intelligence. In general, the investigations cited have tended 
to show that rate of mental development is fairly independent of rate 
of growth in physical traits. An exception may be noted in the marked 
partial correlation (+.53) which Baldwin found between height and 
mental age from a study of 49 girls. 

Higher correlations between various types of physical growth might 
be expected. Baldwin and Stecher found the correlation between 
physical traits only slightly influenced by keeping chronological age 
constant. From their study, partial correlations gave a coefficient of 
+.62 between carpal development and height. Woodrow and 
Lowell, however, found very low correlations between height and 
carpal development and between the degree of carpal development and 
the number of permanent teeth. 


ScoPpE AND METHOD OF THE PRESENT STUDY 


Sources of Data.—The present investigation was limited to a study 
of the physical and mental measurements of 359 girls in the University 
of Chicago Laboratory Schools.² Data for the study were secured
from the files of these schools. On the birthday of each pupil complete 
physical and mental measurements are made and recorded. As has 
been stated, the method of measuring the ossification process which 
has been developed in the laboratory schools offers the advantage of a 
ratio between the carpal quadrilateral and the total ossification which 





1 Baldwin, Bird T. and Stecher, L. I.: “Mental Growth Curves of Normal
and Superior Children,” p. 57.

2 The writer wishes to acknowledge her indebtedness to Dr. G. T. Buswell,
of the University of Chicago, for interest in this problem and for many helpful 
suggestions throughout the course of the investigation. Acknowledgement 
of many courtesies extended is due also to Dr. F. N. Freeman, Dr. Karl Holzinger, 
and the officials and teachers of the University of Chicago Laboratory Schools. 
discounts the influence of the general size of the skeleton upon the size
of the carpal bones. This ossification ratio would seem to make possible
more reliable comparisons between individuals. For the 239
girls of secondary-school age it was possible to secure records of 
height, weight, age of physiological maturation as indicated by pubes- 
cence, and measures of carpal development. Scores from the Otis 
Higher Examination were available for these pupils. For the 120 girls 
of the University Elementary School the data included records of 
height, weight, number of permanent teeth, carpal development, and 
the Stanford-Binet mental age.

Method of Statistical Study.—It was decided to approach the study 
of relationship between mental and physical growth by the method of 
correlation, with chronological age as a constant factor. In a study 
of the data for high-school girls chronological age was held constant 
by dividing the pupils into five age groups from the thirteenth year 
through the seventeenth. As the measurements for each girl had been 
made upon her birthday, the factor of chronological age was thus 
entirely eliminated from the correlation. The ages of the 120 elemen- 
tary-school girls ranged from 6 years through 12. For this group 
chronological age was held constant through the use of the following 
formula for partial correlation: 


$$r_{12.3} = \frac{r_{12} - r_{13}\,r_{23}}{\sqrt{1 - r_{13}^{2}}\;\sqrt{1 - r_{23}^{2}}}$$

Method of Analytic Study.—The investigation was carried further
by an analytic study of 38 cases exceptional in age of physiological
maturation, or pubescence. Distribution was made of the ages of
maturing of 487 girls enrolled in the University of Chicago High School.
The average age of maturing was found to be 13 years, 6 months. An
intensive study was made of every type of school record relating to
individuals so far above or so far below the average as to seem exceptional
in age of pubescence.





RESULTS OF A STATISTICAL STUDY OF THE PROBLEMS 


In this study, interest is centered in the relation of pubescence, or
physiological maturation, to general physical and mental growth, and in
the correlation of mental age with the generally accepted standards of
anatomical development. The results of the statistical study in
tabular form follow.
Mental Age and Ossification Ratio.—Carpal growth has been
selected by many as the most reliable standard of anatomical maturity.
Table I shows the result of correlating this index with measures of
intelligence for the six groups studied.


TABLE I.—CORRELATION OF MENTAL AGE WITH OSSIFICATION RATIO

  Chronological    Number of
  age              cases           r         PE of r

   6-12             120          +.016        .061
   13                44          -.137        .099
   14                62          -.139        .084
   15               [?]          -.174        .12
   16                45          -.022        .10
   17                37          +.041        .11

  [?] The row for age 15 is missing from this copy of the table; r and PE
  are taken from item 2 of the Summary, and the number of cases is not
  recoverable.














These data indicate that when the factor of chronological age is
eliminated there is practically zero correlation between carpal develop- 
ment and mental age. 
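
The probable errors printed beside each coefficient can be approximated with a short sketch. The article does not state how they were computed; the classical formula PE = 0.6745 (1 - r²) / √N is assumed here because it reproduces the printed values closely. The function name `probable_error_r` is illustrative only.

```python
import math


def probable_error_r(r: float, n: int) -> float:
    """Probable error of a correlation coefficient.

    Assumption: the classical formula 0.6745 * (1 - r^2) / sqrt(N),
    which the article does not state explicitly.
    """
    return 0.6745 * (1.0 - r * r) / math.sqrt(n)


# Rows of Table I as (r, number of cases, printed PE).
rows = [(0.016, 120, 0.061), (-0.137, 44, 0.099), (-0.139, 62, 0.084)]
for r, n, printed in rows:
    print(f"r = {r:+.3f}, N = {n}: PE = {probable_error_r(r, n):.3f} "
          f"(printed {printed:.3f})")
```

On this reading a coefficient is commonly taken seriously only when it is several times its probable error, which none of the coefficients in Table I approaches.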

Mental Age and Dentition—For the 120 girls under 13 years of 
age, records of permanent teeth were available. The correlation of 
dentition with the Stanford-Binet scores is slightly negative (-.12 ±
.06) with chronological age constant. When the age factor is not
eliminated, mental age shows a higher correlation with chronological 
age (+.82) than with dentition (+.64). 
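
A short sketch may make clear how a raw correlation of +.64 can turn slightly negative once chronological age is held constant. It applies the first-order partial correlation formula given under Method of Statistical Study. The correlation between dentition and chronological age is not reported in the article, so the value .83 below is purely hypothetical, chosen only to illustrate the effect; the function name `partial_r` is likewise illustrative.

```python
import math


def partial_r(r12: float, r13: float, r23: float) -> float:
    """First-order partial correlation of variables 1 and 2, holding 3 constant."""
    return (r12 - r13 * r23) / (
        math.sqrt(1.0 - r13 ** 2) * math.sqrt(1.0 - r23 ** 2)
    )


r_ma_dentition = 0.64   # mental age with dentition (from the text)
r_ma_ca = 0.82          # mental age with chronological age (from the text)
r_dentition_ca = 0.83   # HYPOTHETICAL: not reported in the article

result = partial_r(r_ma_dentition, r_ma_ca, r_dentition_ca)
print(round(result, 3))
```

With these inputs the partial coefficient comes out near -.13, of the same order as the reported -.12. The point is only that when both measures depend strongly on age, most of their raw correlation is accounted for by age itself.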

Mental Age and Height.—Table II shows low or zero correlations 
between height and mental development. The correlations are in 
all cases positive, but negligible or low even for the younger groups. 


TABLE II.—CORRELATIONS OF MENTAL AGE WITH HEIGHT

  Chronological    Number of
  age              cases           r         PE of r

   6-12             120          +.34         .05
   13                44          +.009        .10
   14                62          +.065        .08
   15                29          +.111        .123
   16                45          +.023        .10
   17                37          +.245        .103
Mental Age and Weight.—For the groups studied, weight has about
the same correlation as height with mental age. In either case, the
highest correlation is for the elementary school girls; but even for this
group the relationship is not marked (height and mental age +.34 ±
.05; weight and mental age +.39 ± .05).


TABLE III.—CORRELATIONS OF MENTAL AGE WITH WEIGHT

  Chronological    Number of
  age              cases           r         PE of r

   6-12             120          +.39         .05
   13                44          -.063        .10
   14                61          +.099        .085
   15                29          +.150        .12
   16                45          +.209        .096
   17                37          +.175        .107





Mental Age and Age of Maturing.—For each chronological age group 
above the thirteenth year, wide range was found in age of pubescence. 
Each group was studied separately and correlations were made between 
precocity of maturing and mental age. 


TABLE IV.—CORRELATIONS OF PRECOCITY OF MATURING WITH MENTAL AGE

  Chronological    Number of
  age              cases           r         PE of r

   14                45          +.021        .10
   15                27          +.029        .129
   16                44          -.086        .10
   17                33          -.325        .104














From these data there is no evidence that early maturing favors 
more rapid mental development. This finding agrees with the results 
from a comparison of 23 prepubescent girls 13 years of age with 12 
postpubescent girls of the same chronological age. 

The central tendencies for the two groups show marked superiority 
in physical development for the girls physiologically mature. The 
median IQ, however, is one point in favor of the prepubescent girls. 
TABLE V.—CENTRAL TENDENCIES IN PHYSICAL AND MENTAL DEVELOPMENT FOR
21 GIRLS MATURE AND 23 GIRLS IMMATURE AT AGE 13

                 Median          Median      Median
                 ossification    height,     weight,     Median
                 ratio           inches      pounds      IQ

  Mature            1.03           61.3       102.0       113
  Immature          1.00           59.1        90.4       114

















Age of Maturing and Ossification Ratio.—Table VI shows the rela- 
tionship between age of pubescence and rate of carpal growth for four 
chronological age groups. 


TABLE VI.—CORRELATION OF PRECOCITY OF MATURING AND OSSIFICATION RATIO

  Chronological    Number of
  age              cases           r         PE of r

   14                47          +.562        .067
   15                27          +.467        .10
   16                44          +.093        .10
   17                33          +.24         .11














The correlation of .562 for the 47 girls of age 14 is pronounced.
The lower correlations with approach of maturity are to be expected.

Age of Maturity and Height.—It is sometimes stated that early
maturity tends to greater ultimate height. The data presented in
Table VII do not support this conclusion. For the girls 16 and 17
years of age the correlation of height with precocity of maturing is
negative but too low to be significant.


TABLE VII.—CORRELATION OF PRECOCITY OF MATURING AND HEIGHT

  Chronological    Number of
  age              cases           r         PE of r

   14                46          +.128        .097
   15                27          +.147        .126
   16                44          -.172        .098
   17                33          -.071        .116
Age of Maturing and Weight.—With weight, as with height, ultimate
size does not seem to be influenced by the age of pubescence.
There is a marked lowering of the coefficients with advance in age and
approach to cessation of growth.


TABLE VIII.—CORRELATION OF PRECOCITY OF MATURING AND WEIGHT

  Chronological    Number of
  age              cases           r         PE of r

   14                46          +.428        .081
   15                27          +.368        .112
   16                44          +.049        .101
   17                33          +.003        .117

















Dentition and Ossification Ratio.—The data for the 120 elementary 
school girls give a positive but low correlation between rate of carpal 
growth and rate of eruption of permanent teeth. By the method of 
partial correlation, the coefficient is +.305 ± .055. A more decided
relationship might be expected between these two measures of anatomi- 
cal growth. 

Ossification Ratio and Height.—No uniform or very marked correspondence
between height and ossification ratio was found, as may be
seen from Table IX.


TABLE IX.—CORRELATIONS OF HEIGHT AND OSSIFICATION RATIO

  Chronological    Number of
  age              cases           r         PE of r

   6-12             120          +.35         .05
   13                45          +.201        .107
   14                63          +.318        .076
   15                29          -.105        .123
   16                45          +.01         .10
   17                37          +.227        .105

















The low correlation of height with ossification ratio may be due to 
the fact that such a ratio discounts the influence of the general size of 
the individual upon the size of the carpal bones. The total ossifica- 
tion would probably show more decided correlation with height. 
SUMMARY


1. The present study agrees with similar investigations in the find- 
ing of very low partial correlations between mental age and the 
commonly accepted indices of physiological and anatomical 
development. 

2. In this investigation zero or very low negative correlations are
obtained between mental age and the ossification ratio. For the six
groups studied the coefficients are, respectively, +.016 ± .061,
-.137 ± .099, -.1389 ± .08, -.174 ± .12, -.002 ± .10, +.041 ± .11.

3. Precocity of maturing also fails to show correlation with
intelligence as measured by mental tests. For four groups the coefficients
are as follows: +.021 ± .10, +.029 ± .129, -.086 ± .10, -.325 ± .104.

4. Between mental age and height the coefficients are zero or low.
The highest correlation (+.34 ± .05) is found for 120 girls under 13
years of age.

5. Weight shows approximately the same relation as height to
mental age.

6. Records for 120 girls give a very low negative correlation
(-.124 ± .06) between mental age and dentition.

7. The most marked correlations from the study are between
precocity of maturing and the ossification ratio: +.562 ± .067 for the
14-year group, and +.467 ± .10 for the 15-year group. Lower coefficients
are found at the years of approach to cessation of growth.

8. From these data there is no evidence that ultimate height and
weight are influenced by the age of maturing.

9. The correlation of dentition with the ossification ratio is positive
but low (+.305 ± .055).

10. Correlations of height with ossification ratio are low, but
positive except at age 15. The low relationship found may be due to the
fact that the ossification ratio tends to discount the influence of
general skeletal size upon the size of the carpal bones. Baldwin’s
correlations between height and total ossification gave a much higher
coefficient.


(To be Continued) 








RELATIVE DIFFICULTY OF NUMBER COMBINATIONS 
IN ADDITION AND MULTIPLICATION 


WILLIAM H. BATSON 


University of South Dakota 
AND 
OLIN E. COMBELLICK 
State Normal and Industrial School 
Ellendale, North Dakota 


Within the past few years a number of treatises have been written 
dealing with the time element in arithmetical calculations. Some 
of these have appeared as articles in educational periodicals while 
others have taken the form of texts on methods of teaching arithmetic, 
or of guides to administering of drill and test exercises which are 
intended to cultivate speed and accuracy in the use of numbers. 

Some of the newer texts set the time in which a problem or set of
problems should be worked by such directions as: “Give the answer
to the following in three minutes or less.” In one text 100
single-column addition problems of two or three numbers each are given.¹
In another² is found the statement, “Here are 20 of the most difficult
of the 45 elementary combinations in addition, try to name all the
sums in less than 15 seconds,” and after a few pages this text sets
another standard for attainment by saying, “These are the most
difficult of the 78 multiplication combinations. Can you name the
correct products in less than 30 seconds?”

Guhin,³ in his “Practical Methods of Teaching the Primary
Number Combinations,” sets 50 seconds as a standard for giving any
36 of the addition or multiplication combinations. This would seem
to imply that all sets of combinations are equally difficult and also that
the processes of addition and of multiplication are equally difficult.

One of the widely used standard tests allows eighth grade pupils 
three minutes for writing the results for 100 combinations in arithmetic, 
either addition or multiplication, while another test which has appeared 
more recently, gives 115 seconds as the median time for an eighth 
grader to do the same work. 

Such statements naturally lead one to question which goal is
within the reasonable possibility of attainment and whether those
who have passed through the schools really possess such facility with
numbers as these standards demand. But being unable to find any
definite information as to whether the combinations in a given process
are equally difficult, or whether it takes more time to perform the
operations in one of the fundamental processes than it does to perform
those in another process with the same numbers, we determined to seek
for reliable information along these lines.

It was at first our intention to make a study of addition, subtrac- 
tion, multiplication and division, but it very soon became evident 
that time would not permit us to make an intensive study of all four 
processes. It may be asked why we chose addition and multiplication
instead of subtraction and division. As answer it will be sufficient
to quote from David Eugene Smith of Columbia University who,
in his “Progress of Arithmetic,” says: “One of the latest
investigations, and one that was very carefully made, shows that in the
computations connected with 25 different industries 53 per cent of the
operations were in addition, 41 per cent were in multiplication, 4 per
cent were in division, and only 2 per cent were in subtraction.”


THE PROBLEM 





The study was undertaken to ascertain as nearly as possible by 
mechanical and mathematical means the median time in which trained 
individuals would respond correctly to any of the simple combinations 
in addition or multiplication. In most studies of number combinations 
it appears that speed and accuracy have been assumed to be measures 
of difficulty. This assumption is accepted in this paper. Further, 
no attempt is made to examine any of the arithmetical processes 
other than the two figure combinations in the two processes mentioned 
above. The principal steps in the study have been the assembling 
of the apparatus, persuading the individuals to take the tests, trans- 
lating of the kymographic records into Arabic numbers and, last 
but by no means least, tabulating and compiling of approximately 
50,000 reactions upon the several combinations. 


THE APPARATUS 


The physical apparatus used was somewhat complicated, and in 
order that the reader may better understand how the tables of figures 
that appear later were obtained it is deemed best to describe the 
apparatus used for taking the reactions. 
Figure 1 is a diagram of the apparatus as it was assembled. For
the benefit of those who may not know the kymograph K it may be 
explained that it is a strong metallic frame in whose base is a clock 
spring which revolves the drum D. Around this drum, which is about 
20 inches in circumference, was wrapped a piece of smoked paper. 
The time indicator upon a standard to the right was so set that its 
points rested lightly against the smoked paper on the drum. The 
time indicator had two points, each of which was supplied with a 
small electro-magnet. These electro-magnets were connected with 
two separate electric batteries placed on the shelf beneath. The 
wiring for the upper of these magnets was such that the circuit was 
opened and closed by the action of the small victrola V, placed at the
left of the kymograph. With the aid of a stop watch, the victrola
was regulated so that the circuit was closed every two-tenths of a
second. This caused the upper pointer of the time indicator to mark
downward every two-tenths of a second on the smoked paper upon
the drum of the kymograph.

Fig. 1.—Diagram of apparatus.

For the lower pointer of the time-indicator an entirely independent 
circuit was made by connecting its magnet with a second battery 
and extending the connection to a copper plate which was fixed to a 
frame that rested on a nearby table. The other wire of this circuit 
was passed through the barrel of a fountain pen and the end soldered 
to the pen-point so that whenever the pen-point touched the copper 
plate this circuit was completed. Thus the lower pointer of the time 
indicator recorded the writing of the numbers while the upper pointer
was recording the tenths of seconds used.

Figure 2 shows a kymographic record which has the upper and the 
lower lines marked to show what was recorded. The second and 
third lines are just as they were made by the time-indicator. The 
upper line is a record of addition. The numbers just above it show 
the tenths of a second used to think the result of the combination 
written next below. It will be noted that each record consists of 
two lines. Along the top line is marked each two-tenths of a second, 
while the depressions in the lower line show the time of writing. The 
elevations in the lower line show the time of getting ready to write. 





Fig. 2.—A kymographic record. 


The interpretation of the first line of Fig. 2 runs thus: The first 
depression shows the recording of something, probably 2, the sum 
of one and one, then as soon as this is done the person reacted on the 
combination 9 and 4, which took ten-tenths of a second. He then 
wrote 1, lifted his pen and wrote 3, consuming nearly eleven-tenths of 
a second in writing the sum, 13. This done, he used eight-tenths of a 
second in thinking the sum of 9 plus 7 is 16 and he then wrote the 
sixteen in eight-tenths of a second. By continuing in this way, we
get the mental reaction time and then the physical reaction time for
each combination.
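The reading of a record described above can be sketched as follows. The sketch assumes the record has already been reduced to alternating thinking and writing durations in tenths of a second; the function names and the list layout are illustrative and are not part of the original procedure.

```python
TENTHS_PER_TICK = 2  # the upper pointer marked every two-tenths of a second


def ticks_to_tenths(ticks: int) -> int:
    """Convert a count of time marks on the upper line to tenths of a second."""
    return ticks * TENTHS_PER_TICK


def split_record(durations_tenths):
    """Pair alternating (thinking, writing) durations, one pair per combination."""
    if len(durations_tenths) % 2:
        raise ValueError("expected alternating thinking and writing durations")
    pairs = []
    for i in range(0, len(durations_tenths), 2):
        mental, physical = durations_tenths[i], durations_tenths[i + 1]
        pairs.append({"mental": mental, "physical": physical,
                      "total": mental + physical})
    return pairs


# Values from the reading of Fig. 2: 9+4 took ten-tenths to think and nearly
# eleven-tenths to write; 9+7 took eight-tenths for each.
record = split_record([10, 11, 8, 8])
for combination, times in zip(["9+4", "9+7"], record):
    print(combination, times)
```

The "total" entry corresponds to the sum of the mental and physical times that the later tables report alongside the mental time alone.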


DISTRIBUTION OF THE COMBINATIONS 


By taking the forty-five additive combinations and including 
the combinations with naught and their reverse forms, we find there 
are in all 100 different combinations. These were arranged at random 
upon five cards. The cards were 8 by 11 inches in size and had 24 
perforations or openings about 1 inch square. Above these openings 
the combinations were placed so that the results might be written in 
the openings when the card was placed upon the copper plate. 
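The counts in this paragraph can be checked by enumeration. The sketch below is illustrative: the shuffling routine and its seed are assumptions and do not reproduce the authors' actual card arrangement. Twenty scored combinations per card would leave four of the 24 openings free, which is an inference from the counts rather than something stated in the text.

```python
from itertools import combinations_with_replacement, product
import random

# The forty-five elementary combinations: unordered pairs of the digits 1 to 9.
elementary = list(combinations_with_replacement(range(1, 10), 2))

# Adding naught and the reverse forms gives every ordered pair of digits.
all_combinations = list(product(range(10), repeat=2))


def deal_to_cards(combinations, n_cards=5, seed=0):
    """Shuffle the combinations and split them evenly among the cards."""
    shuffled = list(combinations)
    random.Random(seed).shuffle(shuffled)
    per_card = len(shuffled) // n_cards
    return [shuffled[i * per_card:(i + 1) * per_card] for i in range(n_cards)]


cards = deal_to_cards(all_combinations)
print(len(elementary), len(all_combinations), [len(card) for card in cards])
```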
CARD I.—As arranged with the openings indicated. The arrangements for Cards II,
III, IV and V appear below.

[Diagrams of Cards I to V; the printed arrangements of the combinations are not recoverable from the source.]



472 The Journal of Educational Psychology 


It will be observed that each line begins with a one-one combination.
In preparing these cards it was assumed that this was the
simplest combination and would require very little effort either to 
think or to record its result. The kymographic record is such that 
for the first combination only the time of writing is recorded and 
when there is a transfer from the end of one line to the beginning of 
another, the time of transferring is added to the mental reaction time 
of the first combination in the line. In order to avoid these difficulties 
every line was started with the simple one-one. Although persons 
taking the test supposed these were equally important, this part of 
the record was not used. 

The cards were laid upon the copper plate and persons taking the 
test wrote the results in the openings upon the bare copper plate 
with the electric fountain pen. They wrote first the results of the 
one hundred combinations in multiplication and then the results in 
addition. The study was started by having the addition performed 
first but there seemed to be a tendency to persist in adding when taking
the multiplication test afterwards, so that fewer errors were made by
taking the multiplication first.


THE PERSONS TESTED


With one exception, all those tested had taken work above the
common grades. Eighty-three individuals were tested. Their
scholastic ranks were as follows: 1 Ph.D., 22 B.A. or B.S., 8 collegiate
seniors, 11 juniors, 19 sophomores, 12 freshmen, 9 of high school rank
and 1 sixth grader.


THE ARITHMETIC TEXTS USED


An attempt was made to get the names of the texts which had been
studied by those tested but these were too scattered for any one of
them to influence the results of the study. The authors reported were: Milne,
13; Wentworth, 8; Ray, 5; White, Robinson, and Wentworth-Smith
each 4; Hamilton, 3; while 7 were scattered. Thirty-five were unable
to name the texts they had studied. The schools in which those tested
had been trained were scattered from New England to California and
only two reported having received instruction in the same school by
the same teacher. These facts concerning the persons and the texts
have been given to show the randomness of the sample upon which this
study of number facility is based. It is believed to be fairly
representative of the finished arithmetical training of the American schools.
the physical time median for writing the result after it had been
thought out, and finally the sum of the mental and the physical
medians, which represents the time for thinking and recording the
result of the combination. While every combination was attempted
by at least 60 persons, the maximum number of reactions gotten for
any one combination was 84.


TABLE II.—MULTIPLICATION
Time Unit is 1/10 Second. Read (×) Multiplied by

Table I is a summary of results on addition while Table II
summarizes results on multiplication. In order to aid the reader in making
comparisons the combinations have been arranged with the doubles,
naught-naught, one-one, two-two, etc., to nine-nine first and the others
in ascending numerical value follow with the reverse forms in
adjacent lines.

ADDITION ERRORS 


The errors in addition have been summarized in Table III. Here 
each number has been taken separately. In interpreting Table III 
the reader should bear in mind that every number appears in nine 
combinations as the lower member of the combination and in the 
reverse forms of these it appears as the upper member of nine others, 
while it appears as both members of one combination. This makes 19 
combinations for every single digit number. 

In Table III are given (I) the numbers at the left end of the line 
along which are placed (II) the number of errors when the number is 
the lower member of the combination, (III) the number of errors when 
the number is the upper member of the combination, (IV) the errors 
when it is both members, (V) the total errors when the number appears 
in a combination; then (VI) the number of attempts on combinations, 
containing that number, (VII) the percentage of errors among the 
attempts, and (VIII) the per cent of the total errors for all numbers, 
which were made on combinations of the number. 


TABLE III.—ADDITION ERRORS

          Errors when its place              Number of      Per cent     Per cent of
          in combination was       Total     combinations   of errors    total errors
Number    Lower   Upper   Both     errors    attempted      in attempts  made
  I        II      III     IV        V          VI             VII          VIII
  5        19      18       1        38        1,296           2.93         15.45
  9        20      18       0        38        1,296           2.93         15.45
  4        10      14       1        25        1,296           1.93         10.16
  7        14      10       0        24        1,295           1.85          9.75
  1        12       7       4        23        1,302           1.76          9.35
  6        10      12       1        23        1,303           1.76          9.35
  3         8      14       0        22        1,288           1.70          8.94
  2        14       7       0        21        1,289           1.63          8.53
  8         7      11       1        19        1,294           1.47          7.72
  0         5       8       0        13        1,178           1.10          5.28
Total                               246       12,838           1.94        100.0
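Columns V, VII and VIII of Table III follow from columns II, III, IV and VI by simple arithmetic. A minimal sketch of that arithmetic, using the tabled counts, is given below; the dictionary layout and function name are illustrative.

```python
# number: (errors as lower member, as upper member, as both members, attempts)
TABLE_III = {
    5: (19, 18, 1, 1296), 9: (20, 18, 0, 1296), 4: (10, 14, 1, 1296),
    7: (14, 10, 0, 1295), 1: (12, 7, 4, 1302), 6: (10, 12, 1, 1303),
    3: (8, 14, 0, 1288), 2: (14, 7, 0, 1289), 8: (7, 11, 1, 1294),
    0: (5, 8, 0, 1178),
}


def error_summary(table):
    """Return {number: (total errors, % of attempts, % of all errors)} and the grand total."""
    totals = {n: lower + upper + both
              for n, (lower, upper, both, _) in table.items()}
    grand_total = sum(totals.values())
    summary = {}
    for n, (_, _, _, attempts) in table.items():
        summary[n] = (
            totals[n],
            round(100 * totals[n] / attempts, 2),     # column VII
            round(100 * totals[n] / grand_total, 2),  # column VIII
        )
    return summary, grand_total


summary, grand_total = error_summary(TABLE_III)
for number, row in summary.items():
    print(number, row)
```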
It is just a little surprising that five should appear as the arch
offender in addition although nine is a close rival. The only
difference between the two is that the one error for a five combination,
shown in column IV for both members of the combination, involves
five twice. Thus five appeared 39 times in 38 combinations in
connection with errors, whereas 9 appeared only 38 times. Contrary to
the belief of many educators, 1 has a high percentage of error. After
studying several texts and charts on addition, the conclusion reached
is that 1 holds this place because of its being neglected in the drill work
of the schools and that it has failed to attain the proper habituation.
By comparing the “lower” and the “upper” columns of Table III,
it will be observed that the place which the number holds in the
combination matters very little.


MULTIPLICATION ERRORS 


The errors in multiplication have been summarized in Table IV. 
This table is quite similar to Table III, except that the numbers in 
the multiplier and the multiplicand columns include the errors made 
on the combinations where both factors were the same number. The 
number in the both column is subtracted from the sum of the errors 
in the multiplier and the errors in the multiplicand to prevent these 
from being counted twice when giving the total number of errors. 


TABLE IV.—MULTIPLICATION ERRORS

          Errors when its place                   Number of      Per cent     Per cent of
          in combination was             Total    combinations   of errors    total errors
Number    Multiplier  Multiplicand  Both errors   attempted      in attempts  made
  I          II          III         IV    V         VI             VII          VIII
  0          85          63           0   148       1,161          12.75         24.15
  9          29          39           3    65       1,376           4.72         10.60
  7          31          29           2    58       1,384           4.19          9.46
  8          30          31           3    58       1,387           4.18          9.46
  4          24          36           4    56       1,471           3.87          9.14
  6          26          27           1    52       1,385           3.75          8.48
  5          27          27           3    51       1,428           3.57          8.32
  1          29          19           0    48       1,377           3.48          7.83
  3          18          26           2    42       1,360           3.08          6.85
  2          17          19           1    35       1,468           2.39          5.71
Total                                     613      13,797           4.43        100.00
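Because the multiplier and multiplicand columns of Table IV both include the errors made on the doubles, the total for each number is the sum of the two columns less the "both" column. The sketch below applies that correction to the tabled counts; the names are illustrative.

```python
# number: (errors as multiplier, errors as multiplicand,
#          errors where both factors are that number, attempts)
TABLE_IV = {
    0: (85, 63, 0, 1161), 9: (29, 39, 3, 1376), 7: (31, 29, 2, 1384),
    8: (30, 31, 3, 1387), 4: (24, 36, 4, 1471), 6: (26, 27, 1, 1385),
    5: (27, 27, 3, 1428), 1: (29, 19, 0, 1377), 3: (18, 26, 2, 1360),
    2: (17, 19, 1, 1468),
}


def total_errors(multiplier: int, multiplicand: int, both: int) -> int:
    """Subtract the doubles once so that they are not counted twice."""
    return multiplier + multiplicand - both


totals = {n: total_errors(m, c, b) for n, (m, c, b, _) in TABLE_IV.items()}
attempts = sum(a for (_, _, _, a) in TABLE_IV.values())
percent_of_attempts = {n: round(100 * totals[n] / TABLE_IV[n][3], 2)
                       for n in TABLE_IV}

print(totals)
print(sum(totals.values()), attempts)
```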
Several questioned the advisability of including the naught in
the investigation but the writers have observed so many mistakes in 
computations involving the naught that it was resolved to give this 
number its regular place in this investigation. It seems reasonable 
to say that combinations involving either naught or one are the easiest. 
It is quite probable that they appear unfavorably in this test because 
of the attitude of leading educators and teachers, which has tended to 
discourage drill on these numbers. 


TABLE V.—ADDITION COMBINATIONS IN ORDER OF DECREASING TIME

       For mental medians             For mental and physical medians
  A       B       C       D          A’       B’       C’       D’
5.50-   3.91-   3.66-   3.33-      12.89-   10.08-    8.53-    7.69-
3.91    3.67    3.36    2.10       10.12     8.53     7.71     5.70
tenths  tenths  tenths  tenths     tenths   tenths   tenths   tenths
sec.    sec.    sec.    sec.       sec.     sec.     sec.     sec.
9+5     0+3     7+9     3+4        9+5      9+9      7+4      3+1
7+2     8+3     2+8     7+3        6+9      8+4      8+3      9+0
3+8     6+7     9+4     1+5        8+7      4+1      0+5      6+2
0+1     7+4     3+9     0+4        5+8      7+9      4+4      4+5
6+9     5+9     3+3     6+2        8+5      2+1      2+7      8+1
2+1     4+4     9+8     4+2        9+6      7+7      1+8      5+1
5+8     7+6     2+6     4+5        8+6      8+8      5+2      0+2
8+7     5+4     3+2     3+0        4+9      8+2      1+2      4+3
6+5     3+7     4+6     0+5        7+8      1+4      2+9      6+1
9+2     1+3     4+3     5+5        5+8      4+6      6+4      2+4
4+7     8+2     0+0     0+2        7+6      3+2      5+4      3+6
8+5     2+9     9+0     1+7        6+7      5+0      6+3      1+9
3+5     4+8     1+0     3+6        6+8      2+8      3+0      1+7
8+9     2+5     2+0     6+0        7+2      4+7      0+4      3+3
8+6     2+4     8+8     1+9        9+4      6+5      0+9      8+0
4+1     0+8     5+6     2+7        7+5      3+2      2+6      0+1
9+7     5+2     6+6     2+2        9+3      1+3      5+6      3+4
9+3     6+8     5+3     4+0        9+8      9+2      5+3      1+5
0+6     7+1     3+2     6+4        5+7      0+6      0+8      0+0
4+9     8+4     7+8     8+0        4+8      3+5      7+1      4+2
5+1     6+3     9+9     3+1        3+8      5+5      2+2      6+0
9+6     0+9     1+1     7+7        8+9      0+3      2+0      7+0
7+5     1+4     8+1     0+7        3+9      3+7      2+5      0+7
5+7     6+1     1+2     7+0        0+7      0+1      1+1      1+6
1+8     5+0     9+1     ?          ?        7+3      4+0      1+0





1See note, Table I. 
RELATIVE DIFFICULTY OF GROUPS


In Table V, the addition combinations have been arranged in
the order of decreasing time medians. The combinations have been
grouped at the left according to the mental medians and at the right
according to the mental and physical medians combined. For
convenience the groups are designated, respectively, A and A’ for the most
difficult from the standpoint of time consumed, and D and D’ for the


TABLE VI.—MULTIPLICATION COMBINATIONS IN ORDER OF DECREASING TIME

       For mental medians             For mental and physical medians
  A       B       C       D          A’       B’       C’       D’
6.21-   4.83-   4.35-   3.93-      16.90-   13.33-   10.55-    8.18-
4.87    4.36    3.97    3.11       13.40    10.66     8.35     6.92
tenths  tenths  tenths  tenths     tenths   tenths   tenths   tenths
sec.    sec.    sec.    sec.       sec.     sec.     sec.     sec.

  A       B                          A’                         D’
6×9     1×8                        6×9                        7×0
4×4     9×6                        9×6                        9×1
9×0     2×9                        8×7                        7×1
0×6     5×3                        7×5                        3×2
9×7     6×8                        8×6                        0×2
8×7     3×3                        8×8                        3×2
8×6     4×9                        9×5                        1×4
8×8     6×5                        9×7                        0×9
8×2     6×7                        3×8                        1×3
4×0     7×9                        6×8                        0×1
5×4     8×5                        5×9                        0×5
7×4     4×7                        6×7                        8×0
4×2     4×5                        5×7                        ?
6×1     4×3                        7×7                        ?
6×3     7×0                        5×5                        ?
9×5     2×0                        7×4                        ?
2×4     0×9                        7×8                        ?
5×6     2×5                        5×6                        ?
1×2     7×5                        7×9                        1×0
7×8     3×6                        8×9                        1×7
8×9     0×1                        4×8                        3×0
9×2     5×2                        7×6                        6×0
6×2     9×4                        4×9                        0×0
3×8     7×7                        8×4                        0×2
7×2     9×9                        6×4                        1×1

[Columns C, D, B’ and C’ are not recoverable from the source, apart from the entries 9×4 and ?×5 in the first two rows of one of them.]

least difficult while groups B and B’ and groups C and C’ contain the
combinations of intermediate difficulty. At the top of each column 
is found the time median for the first and last combination in the group. 
These medians give an idea of the range of difficulty within the group. 
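The grouping used in Tables V and VI can be sketched as a sort by decreasing median followed by a split into four equal groups, each headed by the medians of its first and last combination. In the sketch below only the 5.50 for 9+5 is taken from the head of column A; every other median is a hypothetical value chosen for illustration, since Tables I and II are not reproduced here.

```python
from string import ascii_uppercase


def group_by_decreasing_time(medians, n_groups=4):
    """Sort combinations by decreasing median and cut them into equal groups.

    Assumes the number of combinations is divisible by n_groups.
    """
    ordered = sorted(medians.items(), key=lambda item: item[1], reverse=True)
    size = len(ordered) // n_groups
    groups = {}
    for g in range(n_groups):
        chunk = ordered[g * size:(g + 1) * size]
        groups[ascii_uppercase[g]] = {
            "combinations": [name for name, _ in chunk],
            "range": (chunk[0][1], chunk[-1][1]),  # first and last median
        }
    return groups


# Hypothetical medians in tenths of a second (illustration only).
example = {"9+5": 5.5, "7+2": 5.1, "0+3": 3.9, "8+3": 3.8,
           "7+9": 3.6, "2+8": 3.5, "3+4": 3.3, "7+3": 3.2}
for label, group in group_by_decreasing_time(example).items():
    print(label, group)
```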


REASONABLE STANDARDS


By comparing the standards for speed set by those who have put 
out drill exercises for use in the public schools, there is found a con- 
siderable difference in time allowed by the different authors for the 
one hundred combinations in addition or multiplication. The follow- 
ing tabulations give the number of seconds allowed for the one 
hundred combinations in addition and multiplication by different 
writers: 








                     Number of seconds       Number of seconds
Name of author       for 100 combinations    for 100 combinations
                     in addition             in multiplication
Studebaker                 240                     240
Courtis                    180                     180
Guhin                      139                     139
Clapp                      115                     138
This study                  93                     114











It should be stated that the standards set by Studebaker, Courtis, 
Guhin, and Clapp are for eighth grade pupils while the results of this 
study were derived from the reactions of mature people. One would 
infer from the above figures that the standards set for speed in handling 
the simple combinations are not as high now as they have been in the 
past or else there is improvement after leaving the eighth grade. 
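Standards stated for a different number of combinations can be put on the common basis of 100 combinations by simple proportion. The sketch below converts Guhin's 50 seconds for 36 combinations, which gives the 139 seconds tabulated above. Treating the 93 and 114 seconds as this study's totals is an assumption drawn from the surrounding text, and the function names are illustrative.

```python
def seconds_per_hundred(seconds: float, n_combinations: int) -> float:
    """Scale a time allowance to 100 combinations by simple proportion."""
    return seconds * 100 / n_combinations


def tenths_per_combination(seconds_for_hundred: float) -> float:
    """Average time per combination, in tenths of a second."""
    return seconds_for_hundred * 10 / 100


guhin = seconds_per_hundred(50, 36)
print(round(guhin))  # Guhin's standard on the basis of 100 combinations

# Assumed to be this study's totals for 100 combinations.
study = {"addition": 93, "multiplication": 114}
for process, seconds in study.items():
    print(process, tenths_per_combination(seconds))
```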


TEACHERS’ JUDGMENTS 


In carrying on this study, we have been favored with the results 
of a vote by 246 county superintendents and teachers of Montana, who 
voted upon the fifteen combinations most difficult to learn and retain 
in both addition and multiplication. The results of this vote arranged 
in the descending order of votes received are found in the second and 
fifth columns of the following tabulations. 
From Table V and Table VI, the fifteen most difficult combinations,
as determined by the mental time medians and by the mental
and physical time medians in this study, are placed in columns one
and three for addition and in columns four and six for multiplication.














The fifteen most difficult addition        The fifteen most difficult multiplication
combinations as selected by                combinations as selected by

   1          2           3                   4          5           6
Mental     Teachers’   Mental and          Mental     Teachers’   Mental and
medians    judgment    physical            medians    judgment    physical
                       medians                                    medians
9+5        9+4         9+5                 6×9        9×6         6×9
7+2        9+7         6+9                 4×4        9×7         9×6
3+8        9+8         8+7                 9×0        7×6         8×7
0+1        7+6         5+8                 0×6        9×8         7×5
6+9        8+7         8+5                 9×7        9×3         8×6
2+1        8+5         9+6                 8×7        8×7         8×8
5+8        9+6         8+6                 8×6        7×4         9×5
8+7        7+4         4+9                 8×8        9×4         9×7
5+5        8+3         7+8                 8×2        7×3         3×8
9+2        9+5         5+8                 4×0        8×6         6×8
4+7        7+5         7+6                 5×4        8×4         5×9
8+5        8+6         6+7                 7×4        8×3         6×7
3+5        6+5         6+8                 4×2        9×5         5×7
3+9        9+3         7+2                 6×1        7×7         7×7
8+6        9+2         9+4                 6×3        9×9         5×5




















This comparison reveals a very low correlation and seems to
indicate that either teachers’ judgment of difficulty is very unreliable
and perhaps is determined more by the physical difficulty of recording
the results than by the real mental difficulty of the combination, or
habituation is carried to a greater degree of efficiency on combinations
understood to be hard than on those supposed to be easy.
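One rough way to look at the agreement between two such lists is to count the combinations they share. The sketch below does this for the first eight addition entries of the mental-median and teachers' columns only, as an illustration; it is not the authors' procedure, and treating a combination and its reverse form as the same item is an added assumption.

```python
def shared(list_a, list_b, ignore_order=False):
    """Return the combinations common to both lists.

    With ignore_order=True, a combination and its reverse form
    (for example 5+8 and 8+5) are treated as the same item.
    """
    def key(combination):
        parts = combination.split("+")
        return tuple(sorted(parts)) if ignore_order else tuple(parts)

    return {key(c) for c in list_a} & {key(c) for c in list_b}


# First eight rows of the addition columns of the comparison above.
mental = ["9+5", "7+2", "3+8", "0+1", "6+9", "2+1", "5+8", "8+7"]
teachers = ["9+4", "9+7", "9+8", "7+6", "8+7", "8+5", "9+6", "7+4"]

print(shared(mental, teachers))
print(shared(mental, teachers, ignore_order=True))
```

Even with reverse forms counted together, fewer than half of these eight entries coincide, which is in keeping with the low agreement noted above.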


CONCLUSIONS


1. Tables I and II show the degree to which the simple combinations
in addition and multiplication have become habituated by a
group of adult people. The data in these tables show that the
difficulties do not lie entirely with those numbers of larger numerical value,
as is generally supposed, but that numbers generally taken to need
little attention do give a large percentage of errors.

2. The condition stated above may be the result of native difficulty
of the combinations or poor distribution of practice.

3. It has been shown that a group of presumably average persons, who
have completed our public schools, possess a facility in speed with
numbers that exceeds that required by most of the standards set by
educators for pupils of the upper common grades.

4. Since a number of those tested were well past 30 years of age
and since some of them declared that they had not been interested in
arithmetic and had had very little to do with it for 15 or more years,
the study tends to show that the facility acquired in the schools is
not lost by not keeping in practice but continues and probably increases,
at least until middle life is reached; or else that they obtained a much
greater facility in handling number combinations than is now demanded.

5. The range of relative mental difficulty of the combinations is
not so great as the range of relative physical difficulty of writing results.

6. Finally the study provides a basis by which drill exercises used in
our schools now and also those that may be prepared later may be
checked to determine to what extent drill is provided where drill is
needed.

BIBLIOGRAPHY


1. Lennes, N. J. and Jenkins, F.: “Applied Arithmetic.” Lippincott, 1920.

2. Anderson, R. F.: “Anderson Arithmetic.” Silver, Burdett & Co., 1921.

3. Guhin, H. H.: “Practical Method Example Book.” Hub City Supply Co.,
Aberdeen, S. Dak.

4. Studebaker, J. W.: “Economy Practice Exercises in Arithmetic.” Scott
Foresman Co., 1915.

5. Courtis, S. A.: “Standard Practice Tests in Arithmetic.” World Book Co.,
1920.

6. Clapp, F. L.: “Standard School Tests, Texts A and B.” University of
Wisconsin, 1921.







SOME DATA AS TO THE EFFECT OF PREVIOUS 
TESTING UPON INTELLIGENCE TEST SCORES 


C. W. ODELL 
University of Illinois 


It is the purpose of this article to present two sorts of data which 
offer some evidence as to the effect upon the intelligence test scores 
made by pupils of their having taken one or more similar tests at some 
previous time. The first set of data presented deal with a situation 
resembling that which occurs in the ordinary school use of intelligence 
tests; the second set with unusual and experimental conditions which 
nevertheless may throw some light upon the question at issue. 

During the school year 1923-24 the Bureau of Educational Research
of the University of Illinois gathered a number of items of information
concerning each of about one-half of the high school seniors in the
State.¹ Among the questions asked were the following:

“Have you ever taken an intelligence test before?” and, “If so,
when?” Furthermore, the Otis Self-administering Test of Mental
Ability, Higher Examination, Form A, was given to all those included in
the study. The seniors were divided according to the answers given to
the two questions. Those who stated that they had never taken an
intelligence test before formed one group, whereas those who had
done so were divided according to whether they had been tested within
the past year or not. In making these tabulations seniors from schools
in which either all or none stated that they had been previously tested
were not included. Furthermore, each of the three groups was divided
into five classes according to the sizes of the schools.² The total
number of seniors included in these tabulations was 5283 and the total
number of schools 113.

Table I presents the medians of the distributions of scores just 
referred to. It will be seen that on the whole there is a well-marked 
tendency for the median scores of those who had never been tested 
before to be somewhat lower than the medians of those who had taken 
intelligence tests previously. Although in four of the five classes 





1 A more complete account of this study may be found in the following publica- 
tion: Odell, Charles W.: Conservation of Intelligence in Illinois High Schools. 
University of Illinois Bulletin, Vol. 22, No. 25; Bureau of Educational Research 
Bulletin No. 22. Urbana: University of Illinois, 1925, 55 pp. 

2 Class I included schools having 1000 or more pupils; Class II, those having 
500 to 999; Class III, 300 to 499; Class IV, 100 to 299; and Class V, less than 100. 
TABLE I.—MEDIAN POINT SCORES OF HIGH SCHOOL SENIORS ACCORDING TO
PREVIOUS EXPERIENCE WITH INTELLIGENCE TESTS AND ALSO ACCORDING
TO SIZES OF SCHOOLS

           Not tested     Tested within     Tested more than
Class      previously     the past year     a year previously     All
I             46.4            46.9               48.3             47.2
II            46.1            49.0               46.8             46.8
III           47.0            50.2               48.7             48.2
IV            43.2            47.2               46.7             45.2
V             42.2            44.6               42.5             42.7
All           45.1            47.3               47.4             46.3

the median scores of the seniors tested within the past year are greater
than of those tested more than one year previously, yet for the whole
group the two medians are almost exactly the same.¹ For the total
group the superiority of those who had been tested at some previous
time was slightly more than two points. Expressed in terms of
intelligence quotients, this equals a difference of the same number of points
since, at the average age of high school seniors, one added point on the
test score is practically equal to one point of IQ.
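The differences discussed here can be read off Table I directly. The sketch below subtracts the median of the untested group from each of the two tested groups, class by class; the dictionary layout and variable names are illustrative.

```python
# class: (not tested previously, tested within the past year,
#         tested more than a year previously), medians from Table I
TABLE_I = {
    "I": (46.4, 46.9, 48.3),
    "II": (46.1, 49.0, 46.8),
    "III": (47.0, 50.2, 48.7),
    "IV": (43.2, 47.2, 46.7),
    "V": (42.2, 44.6, 42.5),
    "All": (45.1, 47.3, 47.4),
}

# Gain of each tested group over the untested group, in score points.
gains = {label: (round(recent - untested, 1), round(earlier - untested, 1))
         for label, (untested, recent, earlier) in TABLE_I.items()}

# Classes in which the recently tested group has the higher of the two
# tested medians (the "All" row is excluded).
recent_higher = [label for label, (_, recent, earlier) in TABLE_I.items()
                 if label != "All" and recent > earlier]

print(gains)
print(recent_higher)
```

For the whole group both gains come out slightly above two points, and the recently tested group leads in four of the five classes, Class I being the exception.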

In addition to this tabulation of all the cases included, comparisons
were made between the scores of pupils in the three groups in each
school in which the numbers composing them were large enough to
yield fairly reliable medians. These results may be briefly summarized
as follows: The number of schools in which the groups that had never
been tested had lower medians than the groups tested within the past
year was eight, whereas there were four in which the reverse was true.
In seventeen schools those who had never been tested had lower
medians than those who had been tested more than one year
previously, whereas the medians of the latter were lower in four cases.
Those who had been tested within the past year had higher medians
in seven schools than did those who had been tested more than a year
previously, whereas the reverse was true in six instances.

The data presented may, it seems to the writer, be fairly 
summarized by saying that on the whole the effect of such previous 
testing as is commonly carried out in school is to raise the intelligence 
quotient upon a later test slightly more than two points, and that 
whether the earlier testing has occurred within the past year or 
previous to that time appears to make little difference. 

1 This is to be explained by the fact that in Class I, the only class in which the 
median score of those tested more than a year previously is the greater, were to be 
found two-fifths of the seniors concerned. 

For most purposes for which intelligence test results are employed 
a difference of two points in IQ is negligible, especially in view of the 
fact that the known probable error of the IQ’s computed from any group 
or even individual test is considerably greater than this amount. 
We may, therefore, conclude that at least insofar as scores on the Otis 
Self-administering Test of Mental Ability, Higher Examination, 
Form A are concerned, the results obtained from high school seniors 
are, on the whole, practically the same whether they have been tested 
previously or not. 

The second set of data to be considered have been secured from an 
experiment conducted to determine the effect of deliberate practice 
and coaching upon the scores made on intelligence tests.¹ In this 
experiment Army Alpha was given to a number of pupils in grades 
VII and VIII and high school, after which several weeks of practice 
and coaching upon material similar to that obtained in the test was 
given, accompanied by occasional re-testing with the several forms 
of Alpha. Soon after the conclusion of the experiment a final test 
with Army Alpha was given to ascertain how much the scores had been 
increased by the training. Something more than two years later the 
pupils were again measured with the same test in order to discover 
how much of the effect remained after so long a period of time. 

Table II summarizes the results obtained. This table shows the 
number of pupils in each age group, ages being taken to the nearest 
year and at the beginning of the experiment. The three sets of three 
columns each represent the results according to the different ways of 
computing IQ’s explained by the notes at the bottom of the table. 
These three methods were used because there is no general agreement 
as to just how IQ’s of persons 14 years of age and over should be 
computed. The first of each set of three columns contains the median 
IQ’s of the various age groups at the time the experiment began. 
The second column contains the medians two or three months later, 
based upon tests given soon after the training period was over. The 
third column contains the median IQ’s found something more than two 
years after the conclusion of the experiment or about two and one-half 
years after its beginning. It will be noticed that many of the medians 
computed by the two latter plans are the same as those computed by 
using the exact chronological ages. This is due to the fact that the 
use of 14 or 16 years as the limit of mental growth does not affect all 
of the pupils, many of them being too young to exceed these ages at 
the first two testing periods and some even at the final testing. 

1 The original study was made by Professor H. N. Glick of the Massachusetts 
State Agricultural College. It will probably be published by the Bureau of 
Educational Research of the University of Illinois. 


TABLE II.—MEDIAN INTELLIGENCE QUOTIENTS ACCORDING TO ARMY ALPHA 
SCORES OF PUPILS BEFORE, SOON AFTER, AND MORE THAN TWO YEARS AFTER 
A PERIOD OF SPECIAL COACHING 

                        IQ’s computed on           IQ’s computed on           IQ’s computed on 
Nearest                 exact chrono-              assumption that men-       assumption that men- 
age when    Number      logical ages¹              tal growth ceases at       tal growth ceases at 
experiment  of pupils                              16 years²                  14 years³ 
began 
                        Before   Soon     2+       Before   Soon     2+       Before   Soon     2+ 
                        train-   after    years    train-   after    years    train-   after    years 
                        ing      train-   after    ing      train-   after    ing      train-   after 
                                 ing      train-            ing      train-            ing      train- 
                                          ing                        ing                        ing 

   11           2        144  |  176   |  150   |   144  |  176   |  150   |   144  |  176   |  150 
   12           4        130  |  159   |  124   |   130  |  159   |  124   |   130  |  159   |  124 
   13           7        127  |  167   |  126   |   127  |  167   |  126   |   127  |  167   |  135 
   14          17        118  |  148   |  109   |   118  |  148   |  109   |   118  |  148   |  122 
   15          11        110  |  158   |  107   |   110  |  158   |  113   |   117  |  169   |  128 
   16           3        113  |  146   |  109   |   113  |  146   |  121   |   127  |  166   |  139 
   All         44        119  |  154   |  115   |   119  |  154   |  118   |   120  |  157   |  130 

1 These IQ’s were computed by dividing the mental age corresponding to the 
point score by the actual chronological age of the pupil at the time the correspond- 
ing test was taken. 

2 These IQ’s were computed the same as those in the three previous columns 
except that for all those of age 16 or above at the time of a given test their chrono- 
logical ages were assumed to be 16. 

3 These IQ’s were computed the same as those in the three previous columns 
except that for those of 14 or more years of age the chronological age was assumed 
to be 14. 
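
The three methods of computation described in the notes above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `iq`, its parameter names, and the pupil shown are hypothetical and do not come from the original study; the sketch assumes the usual ratio IQ (100 times mental age divided by chronological age), with the divisor held at the assumed limit of mental growth when one is used.

```python
def iq(mental_age_months, chrono_age_months, ceiling_years=None):
    """Ratio IQ: 100 * mental age / chronological age.

    ceiling_years is the age at which mental growth is assumed to cease.
    None means the exact chronological age is used (note 1); 16 and 14
    correspond to notes 2 and 3.
    """
    divisor = chrono_age_months
    if ceiling_years is not None:
        # Pupils older than the limit are treated as being exactly at the limit.
        divisor = min(chrono_age_months, ceiling_years * 12)
    return 100.0 * mental_age_months / divisor


# Hypothetical pupil (not from the study): mental age 17 years 0 months,
# chronological age 16 years 6 months at the time of the test.
pupil = dict(mental_age_months=204, chrono_age_months=198)

exact = iq(**pupil)                       # exact chronological age
limit16 = iq(**pupil, ceiling_years=16)   # growth assumed to cease at 16
limit14 = iq(**pupil, ceiling_years=14)   # growth assumed to cease at 14

print(round(exact), round(limit16), round(limit14))  # prints 103 106 121
```

A pupil below the assumed limit receives the same IQ by all three methods, which is why many of the medians in the table coincide.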


The results for all the pupils show that if exact chronological 
ages are used in computing IQ’s the median at the last testing is four 
points lower than that at the first testing; if it is assumed that mental 
growth ceases at 16, the median score is one point lower; but if it is 
assumed that it ceases at 14, it becomes 10 points higher. The 
increase in median IQ from the test just before the training period to 
that soon after its completion is 35 points if exact chronological ages 
are used, also 35 if a limit of 16 years is assumed, but is increased to 
37 if a limit of 14 is employed. By comparison it is evident that by 
far the greater part of the gain due to special training was temporary 
and did not persist throughout a period of something more than two 
years. Even according to the method of computation which yields 
the greatest increase in IQ’s from the last testing over the first, this 
increase is only 10 points or slightly more than one-fourth of that 
produced by the training.¹ According to the other two methods of 
computation the median IQ’s at the last testing are actually slightly 
less than those at the first. 
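
The comparisons just made can be checked directly from the "All" row of Table II. The short Python sketch below is illustrative only; the dictionary keys and the helper name `gains` are labels chosen here, not terms used in the study.

```python
# Median IQ's for all 44 pupils, copied from the "All" row of Table II:
# (before training, soon after training, more than two years after training).
all_row = {
    "exact chronological age": (119, 154, 115),
    "growth ceases at 16": (119, 154, 118),
    "growth ceases at 14": (120, 157, 130),
}


def gains(before, soon_after, later):
    """Return (gain soon after training, net change at the last testing)."""
    return soon_after - before, later - before


summary = {method: gains(*medians) for method, medians in all_row.items()}
for method, (immediate, retained) in summary.items():
    print(f"{method}: gain after training {immediate}, "
          f"change after two years or more {retained}")

# Share of the immediate gain still present at the last testing under the
# most favorable method of computation (limit of 14 years).
immediate_14, retained_14 = summary["growth ceases at 14"]
fraction_kept = retained_14 / immediate_14
print(f"fraction retained: {fraction_kept:.2f}")
```

Under the limit of 14 years the retained fraction is 10/37, which is the "slightly more than one-fourth" referred to in the text.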

From the practical standpoint of administering intelligence tests 
in a school system one is not concerned with a situation in which the 
pupils have had the amount of deliberate and carefully directed prep- 
aration given in this case. It doubtless occurs that in some cases 
teachers or others endeavor to prepare pupils for intelligence tests or 
that pupils themselves secure copies of the tests and make some prep- 
aration. However, even in such instances, which the writer cannot 
believe are very frequent, it seems unlikely that the amount of prep- 
aration is at all equal to that given in the experiment just reported. 
However, if a test is given within a short time after such preparation 
it is very likely that the increase in scores will be greater than that 
between the first two of the tests discussed above. On a test given 
any considerable time later it appears that this effect will be negligible. 

On the basis of the evidence afforded by the two sets of data 
presented and discussed in this article, it seems to the writer that the 
conclusion may be drawn that such previous testing and even occasional 
special training as falls within the experience of most public school 
pupils is not likely to cause serious errors to enter into the scores which 
they make upon intelligence tests at some later date and that, there- 
fore, there is ordinarily no necessity to attempt to correct these scores 
for any practice effect. This conclusion, however, does not hold in 
case a test is given within a short time after a previous test has been 
taken. Various experiments of this sort seem to indicate that the 
repetition of another form of the same test or of some other test which 
is very similar within a few days after the first test has been made 
results in an average increase in the scores of about 10 per cent. A 
third test, given under the same conditions, will perhaps result in a 3 
or 4 per cent increase; a fourth test will add a little more; and so on for 
perhaps two or three more repetitions. 





1 The actual increase due to practice and coaching was considerably greater 
than is shown by the figures given in this article. The second testing was not 
given immediately at the close of the training period, but a few days or even weeks 
later, and therefore a considerable part of the increase in ability to make high 
scores produced by the training had already been lost. 


SUCCESS IN TYPEWRITING 


MINNIE A. VAVRA 
St. Louis, Mo. 


An intensive study of the prognosis of typewriting ability and the 
prevention of failures was made at the Grover Cleveland High School 
of St. Louis. It was the joint work of Miss Mable Easterbrook and 
myself. We worked together, agreeing on methods of procedure, 
cooperating faithfully in carrying out the plan in every detail, and 
concurring in the final conclusions. 

Present day conditions make the results of this study of interest 
to all educators. One of these conditions, which is quite universal, 
is the increase in the number of pupils of inferior mental ability now in 
our high schools. This is due not so much, perhaps, to an increase in 
the percentage coming from this group (although 50 per cent of those 
in the low group in Grade VIII come to the Cleveland High School) as 
to the fact that an increase in total enrollment necessarily means an 
increase in the number with low intelligence quotients. In addition, 
there is a decided effort to 
increase efficiency in our schools by decreasing the number of failures. 
Today, also, educators are feeling an increasing responsibility for 
helping each pupil make the best of his ability, as is evidenced by the 
widespread interest in Vocational Guidance. They are trying to 
keep the “square peg out of the round hole,” to prevent the discouragement 
that frequently accompanies failure, and to save time and 
money for the pupil, teacher, and the public. 

Hence the results obtained in a careful study of 332 subjects of the 
usual high school age under classroom conditions are interesting. 
Each pupil in beginning typewriting was given a prognostic test. A 
year before we began our experiment, Miss Mary Lynch, who was at 
that time teaching at Cleveland, was experimenting with a prognostic 
test which she made up while studying at Chicago University in the 
Summer session. This test is a kind of substitution test. Since it 
involves those qualities that are among the requirements for successful 
typists, this kind of a test would be likely to show a good correlation 
with typewriting achievement. 

The test consists of 100 occurrences of the seven letters chosen in 
the key. Here is a copy of the test with the accompanying directions. 

Directions 


In the key below, a number appears under each of seven letters. 
When I say Go write below the letters of this page the number that 
appears in the key. Take the letters in the order given. 


KEY 

i i i i ie a 

1 2 3 4 5 6 7 

U M Z = A S M X A U 
X S M Z Y X A U X Y 
Y Z U M S Y X A S M 
S Y Z U X A S M Z U 
U Y M X Z Y U M A Z 
A Z M X Y S U A X Y 
M X U 4 S A Z M U A 
S = S M U Z A X S Z 
Z U X M S Y Z Z U X 
U X Z Y X A S Y M U 


This test was given as a group test, about 30 pupils being in a 
group at one time on each of three consecutive days. The time allowed 
was two minutes during the Fall term and one and a half minutes in 
the Spring term. The reduction in the time was made because it was 
found that several pupils finished as much as 25 seconds before the 
end of the two minutes. The method of scoring was number com- 
pleted minus five points for each error. Then an average of the results 
made in the three efforts was taken as that individual’s score. From 
these scores the arithmetical average of the entire group was calcu- 
lated. The average score of 181 pupils in the Fall term was 60; for 
the 151 pupils in the Spring term, 62. 
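
The scoring rule just described can be sketched as follows. This is a minimal illustration, assuming that "number completed" means the count of letters attempted in the time allowed; the function names and the sample figures are hypothetical and are not taken from the study's records.

```python
def attempt_score(completed, errors, penalty=5):
    """Score for one trial: number completed minus five points per error."""
    return completed - penalty * errors


def pupil_score(attempts):
    """Average of the trial scores; the study used three trials per pupil."""
    scores = [attempt_score(completed, errors) for completed, errors in attempts]
    return sum(scores) / len(scores)


# Hypothetical pupil: (letters completed, errors) on three consecutive days.
attempts = [(62, 1), (70, 2), (75, 0)]
print(pupil_score(attempts))  # trial scores 57, 60, 75; prints 64.0
```

The group average reported in the text would then be the arithmetic mean of such pupil scores over all pupils tested in a term.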

The achievement was represented by the term grade given at the 
end of 20 weeks. It is significant to explain here that this typewriting 
grade was secured by the application of a formula of marking. Each 
factor was definite and fixed and the grade received depended exactly 
upon the number of units satisfactorily completed. In other words, 
intangible factors such as “attitude,” “interest,” “homework,” and 
“general information” were included only insofar as they produced 
results in typed copy. However, extra time was permitted and some 
weak students secured a passing grade by putting in considerable extra 
time. The coefficient of correlation was undoubtedly affected by this 
factor as also by the wide range in typewriting grades, which extend 
from 20 to 99. The very low grades were the result of the inability 
of the weak student to take the final examination which consisted of a 
timed test on new matter covering all the letters on the keyboard. 
Hence the final examination grade was zero and the grade on daily 
work was thereby reduced one-fifth. This made the class average low, 
about five points lower than the median grade. 

The typewriting grade average was 69 in the Fall term, the sub- 
stitution test average was 60, and the coefficient of correlation was .25. 
This result was not considered satisfactory since it would be generally 
conceded that a coefficient of correlation to be worth much for prognosis 
should be over .40. During the Fall term a careful watch was kept 
on all typewriting work to determine points of unusual difficulty and 
after due consideration the schedules of work were revised in accord- 
ance with these observations. The time was reduced to one and a 
half minutes, as I stated before. Under these conditions the average 
for the substitution test was 62, and the typewriting grade average 
was 72. These form the new bases for the coefficient of correlation 
which was found to be .48 plus for the spring term. Since this coeffi- 
cient is high enough to have prognostic value it may be concluded that 
the test is satisfactory. 
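
For readers who wish to see how a coefficient of this kind is obtained, the following Python sketch computes the product-moment coefficient of correlation between substitution scores and typewriting grades. The article does not state which formula was used, so the product-moment form is an assumption here, and the eight pairs of figures are invented for illustration; they are not the study's data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Product-moment coefficient of correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical substitution scores and term grades for eight pupils.
substitution = [45, 52, 58, 60, 63, 67, 71, 80]
grades = [55, 70, 62, 75, 68, 80, 72, 90]

r = pearson_r(substitution, grades)
print(f"r = {r:.2f}")
```

A coefficient near zero would mean the test tells little about the grade; the study's own criterion was that a coefficient should exceed .40 to be of prognostic value.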

However a study of the data was continued by classifying and 
analyzing to see what conclusions could be drawn. A study of the 
tables giving the number of pupils making grades F, P, M, G, and E 
respectively in relation to scores made on the substitution test showed 
that to secure a majority of F’s and a majority of E’s the line had to 
be drawn at 60. After drawing this line we saw that 73 per cent of 
those who failed made a substitution score below 60. The percentage 
of F’s is three times as great in the group below 60 as in that above. 
On the other hand 80 per cent of the E’s went to those who scored 60 
or above on the substitution test, i.e., high grades were four times as 
frequent in the group above 60. 

A special study was made of those cases that had a failing grade 
when their score predicted success in typewriting. There were 24 of 
them out of the 332 tested. Fifteen of the 24 had a score only a trifle 
above the line of demarcation, i.e., they made a score between 60 and 
69, and all were Grade IX pupils just entering from the grammar school. 
The remaining nine cases were analyzed. Three of these had physical 
handicaps, one having a broken arm, and two having poor eyesight. 
Three others were in the lower third of their class in grammar school 
and did poor work in all other subjects. The eleven cases that received 
E in spite of a low score were analyzed next. In all these cases the 
score was slightly below 60 but above 50. All but two were older 
pupils taking four-year courses and therefore had been in high school 
two years or more. The two who were first term pupils were very 
earnest and one was over age and her work in subsequent terms was 
poor. 

It is interesting to note here that the speed demonstrators who 
visited Cleveland at that time took this same test and made a high 
score around 90 as would be expected. This special study serves to 
show that the substitution test is useful as a prognostic test and with 
the demarcation line at 60 is helpful in foretelling the probable success 
or failure in typewriting. It takes just a few minutes for a pupil to 
take this test, and if his score is very low, it is well to turn his attention 
to some other subject immediately instead of allowing him to spend 
his time in the typewriting room. 

In the next phase of the study, tabulations were made with a view 
to discovering what relation existed between accomplishments in 
typewriting and intelligence quotients as determined by the Terman 
Group Test. Educators confronted with the necessity of providing 
for the pupils of low intelligence have considered the study of languages, 
mathematics, and often science as “out of the question” for them. 
However, typewriting—“something they can do with their hands”— 
seemed plausible as a solution. But what are the facts in the case? 
The table showed that 56 per cent of the group with low intelligence 
quotients received a grade in typewriting below M. In the Below M 
group the percentage decreases with the rise in intelligence quotients. 
At the other end in the Above M group the percentage increases with 
the rise of the intelligence quotients. In the M group the percentage 
remains approximately constant. 

An analysis was made of the cases which stand as apparent excep- 
tions to the general relationship, namely the 19 having a high intelli- 
gence quotient and a low typewriting grade. Of the 19 only three were 
doing in other subjects the work which should be expected and these 
three were very immature, being less than 13 years old. Only five of 
the 19 had satisfactory substitution scores. 

One interesting fact is that while the substitution score and intelli- 
gence quotient each indicate fairly well the probable success or failure 
in typewriting, taken together they form an almost perfect indication. 
There was only one person in the 332 who had both a substitution 
score below the line of demarcation and a low intelligence quotient 
and yet received a high grade in typewriting. This pupil was over 
age and her work in subsequent terms was poor, thus actually not proving 
an exception. On the other hand only five of this large number 
of 332 had both a high substitution score and a high intelligence 
quotient and failed in typewriting. One of these passed on the final 
examination. A second had poor eyesight and did excellent work 
after this defect was corrected. A third did excellent work until he 
decided to change his course, and would not make further efforts. 
The fourth was absent frequently and the last did poor work in all 
subjects. 

Another phase of the study considered the relation between type- 
writing achievement and time spent in high school. The facts col- 
lected here are interesting in throwing light upon the question of the 
advisability of making typewriting one of the junior high school 
subjects. It may be pertinent to explain that there are two courses 
in which typewriting is a required subject, the two-year commercial 
course and the four-year commercial course. Those that select the 
two-year course begin typewriting immediately upon entering or in 
their first term. Those enrolled in the four-year commercial courses 
begin typewriting their fifth term or in their third year. Typewriting 
is also an elective in the third year of the general course. 

In general our first term or Grade IX pupils are two years younger 
than the pupils from the four-year courses and have taken the short 
course either because of the need or desire to get out of school quickly. 
The data revealed that for the lower term pupils the percentage of 
poor grades is over three times as great as for the higher term pupils, 
while the percentage of high grades is less than one-third as great. 

The question naturally arises as to why so great a difference exists 
between the two groups. Are the four-year students a selected group? 
If so, upon what basis? Not upon native intelligence for strangely 
enough the average of intelligence quotients was the same for both 
groups, being 102, which was five points below the average for the 
whole school. It would be justifiable then to conclude that the 
superior records were caused by the greater maturity or the presence of 
those social qualities which have kept them in high school for two 
years, or both. 

The last phase of the problem was to study to what extent a poor 
start in typewriting was overcome by additional time. For this 
purpose the histories of the 50 who failed Typewriting 1 were investi- 
gated. Five left school, six changed their course, one passed on 
examination, and 37 repeated. Of these repeaters one failed a second 
time, and only 17 of the 37 had a grade of M after spending a whole 
year on first term work. Between the second and third terms 13 
more left school, two more changed their course, two dropped type- 
writing for a term, and 20 of the original 50 enrolled in Type- 
writing 2. Three of these failed and only five or one-tenth of the 
original number received a grade above M which would be necessary 
to indicate a practical mastery of the keyboard or to entitle a pupil to 
a recommendation from the department. While in these five cases 
the substitution scores were below 60, only one was below 50 and no 
Intelligence Quotient was below 96. 

In summing up this study the following points may be noted: 

1. That the substitution test indicates the presence of those quali- 
ties that make for success in typewriting, and in 85 per cent of the 
cases tested, its predictions of success or failure in typewriting were 
fulfilled. Therefore, the substitution test gives a fairly good prognosis 
of ability to learn typewriting. 

2. That intelligence quotients furnish a good indication of ability 
to acquire typewriting skill. 

3. That taken together the substitution scores and intelligence 
quotients form an almost perfect prognosis; only five cases out of 332, 
or less than 2 per cent, failed to bear out indications. Therefore taken 
together they are valuable in revealing those qualities that make for 
success or failure in typewriting. 

4. That from a study of the superior records made by the fifth 
term pupils, time spent in high school and maturity have a place 
decidedly among the factors that determine success in typewriting. 
The facts collected are valuable in answering the question of how early 
in the course typewriting should be introduced. 

5. That the time element is not as important as has been supposed 
since so few of those failing are able to come up to the standard even 
after double time. 








THE RELIABILITY OF MEASUREMENT BY GROUP 
TESTS OF MENTAL ABILITY 


C. L. HUFFAKER 


University of Arizona 


Recently Herring¹ discussed the methods of determining the 
accuracy of measurement of mental tests. In this article Herring’s 
conclusions seem to point out that for very fine classification our 
present tests are inadequate. Herring used as a measure of accuracy 
the error which exists in estimating the score which would be made on 
a second comparable form of the same test. Kelley² has shown that 
it is possible to estimate a true score in intelligence with a smaller error 
than a second fallible score; or, in other words, the standard error 
involved in estimating a true measure of intelligence is less than the 
standard error involved in estimating a score in a second comparable 
test. In determining the accuracy with which tests measure, more 
satisfactory results are obtained by finding the deviations from true 
scores than the deviations from other fallible scores. 

The generally accepted meaning of mental measurement is based 
upon the Stanford Revision of the Binet Scale. Terman states that 
IQ’s found by the Stanford Revision of the Binet Scale are distributed 
very closely in accordance with the law of normal distribution and with 
a spread so that the middle fifty per cent fall within the range of IQ’s 
from 92 to 108. This means that with a test the reliability of which 
is approximately .93 the standard deviation of IQ’s in unselected age 
groups is very close to 12. In a group of unselected 12-year-olds we 
would find that the standard deviation of the IQ’s as determined by 
the Stanford Revision of the Binet Scale is 12 points and of mental 
ages 18 months. 
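
The step from the middle fifty per cent to a standard deviation of about 12 can be verified with the normal curve. The sketch below is an illustration only, using Python's standard `statistics.NormalDist`; the variable names are chosen here and are not the author's.

```python
from statistics import NormalDist

# The middle fifty per cent of IQ's lie between 92 and 108, so the quartile
# deviation (probable error) is 8 points on either side of 100.
probable_error = (108 - 92) / 2

# In a normal distribution the quartiles lie about 0.6745 sigma from the mean.
z_quartile = NormalDist().inv_cdf(0.75)

sd_iq = probable_error / z_quartile
print(f"standard deviation of IQ's: {sd_iq:.2f}")

# Converse check: with a standard deviation of 12, where do the quartiles fall?
low = NormalDist(mu=100, sigma=12).inv_cdf(0.25)
high = NormalDist(mu=100, sigma=12).inv_cdf(0.75)
print(f"middle fifty per cent with sigma 12: {low:.1f} to {high:.1f}")
```

The computed value is a little under 12, which agrees with the statement that the standard deviation is "very close to 12."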

As already pointed out the standard deviation of unselected 
12-year-olds may be regarded as 18 months of mental age. The 
reliability coefficient in this case is probably .93. If the test had a 
higher reliability than the Stanford Revision of the Binet Scale 
the obtained standard deviation would be smaller than 18 months. 
In the construction of Table I this decrease or increase of the standard 
deviation has been taken into consideration as well as the reliability of 
the test. 


1 Herring, John P.: Verification of Group Examinations. Journal of Educa- 
tional Psychology, Dec., 1924. 
2 Kelley, Truman L.: “Statistical Method.” P. 214. 





494 The Journal of Educational Psychology 


In the construction of Table I, r represents the reliability coeffi- 
cient as determined by the correlation between the mental ages 
obtained by giving the comparable forms of the same test. This 
correlation for unselected 12-year-olds may or may not be the same as 
that obtained by correlating the IQ’s derived from these mental ages. 
The standard deviation of IQ’s is constant for any age group, while 
the standard deviation for mental ages increases with successive age 
groups until the class of adults is reached. For this reason r as deter- 
mined from IQ’s will be constant while r as determined from mental 
ages will increase as the age of the group used becomes larger. 

The standard error of a score calculated by the formula $\sigma\sqrt{1 - r}$ 
is constant for a given test for all ages. It necessarily follows then 
that the standard errors of the score as given in Table I for the varying 
reliabilities will apply to any age group. The standard error of an 
estimated true score as shown in the last column of Table I is calculated 
from the formula $\sigma\sqrt{r - r^2}$ and is not constant for the varying 
ranges and for that reason applies only to unselected 12-year-olds. 
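
The two formulas can be tried out numerically. The Python sketch below is illustrative and rests on one assumption that the text does not state explicitly: that the obtained standard deviation for a reliability r equals a fixed true standard deviation divided by the square root of r, the true value being fixed by the figure of 18 months at r = .93. The names `TRUE_SIGMA` and `table_row` are hypothetical.

```python
from math import sqrt

# From the text: an obtained sigma of 18 months of mental age when r = .93.
SIGMA_AT_93 = 18.0

# Assumption (not stated in the article): obtained sigma = true sigma / sqrt(r).
TRUE_SIGMA = SIGMA_AT_93 * sqrt(0.93)  # about 17.36 months


def table_row(r):
    """Return (obtained sigma, SE of obtained score, SE of estimated true score)."""
    sigma = TRUE_SIGMA / sqrt(r)
    se_obtained = sigma * sqrt(1 - r)        # sigma * sqrt(1 - r)
    se_true = sigma * sqrt(r - r * r)        # sigma * sqrt(r - r^2)
    return sigma, se_obtained, se_true


for r in (1.00, 0.93, 0.80, 0.60):
    sigma, se_obtained, se_true = table_row(r)
    print(f"r = {r:.2f}: sigma {sigma:.2f}, "
          f"obtained {se_obtained:.2f}, estimated true {se_true:.2f}")
```

Since the second formula equals the first multiplied by the square root of r, the error of an estimated true score is always the smaller of the two, and the difference widens as the reliability falls.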





TABLE I.—ESTIMATED STANDARD DEVIATIONS AND STANDARD ERRORS OF SCORES 
The σ’s and Estimated Errors of Scores as Determined in Unselected 12-year-olds 

                        Standard error of a score 
                        obtained from σ 
   r          σ         Obtained      Estimated 
                        score         true score 

 1.00       17.36         0.00          0.00 
  .99       17.43         1.74          1.74 
  .98       17.52         2.48          2.45 
  .97       17.61         3.05          3.00 
  .96       17.70         3.54          3.47 
  .95       17.79         3.98          3.88 
  .94       17.90         4.38          4.25 
  .93       18.00         4.76          4.59 
  .92       18.09         5.12          4.91 
  .91       18.20         5.46          5.29 
  .90       18.29         5.78          5.49 
  .85       18.83         7.29          6.72 
  .80       19.41         8.68          7.76 
  .70       20.73        11.35          9.50 
  .60       22.40        14.17         10.97 


TABLE II.—MISMEASUREMENT PER THOUSAND 
The Number of Errors of Measurement per Thousand of Unselected Age 
Groups Exceeding Two Months, Four Months, Etc. for Varying 
Reliabilities of the Tests, Based upon Obtained Scores 

r, unselected 
12-year-olds      2      4      6      8     10     12     18     24     30     36     42 

    1.00          0      0      0 
     .99        250     21      0 
     .98        419    105     15      0 
     .97        509    187     48      8      0 
     .96        575    263     93     25      5      0 
     .95        617    317    134     45     12      3    [?] 
     .94        646    358    168     66     21      6    [?] 
     .93        674    401    208     93     36     12    [?] 
     .92        697    435    234    119     51     19    [?] 
     .91        711    459    267    139     64     26    [?] 
     .90        726    484    294    162     80     36    [?]      0 
     .85        787    589    418    280    177    105    [?]      0 
     .80        818    646    490    358    250    168    [?]      6      0 
     .70        867    719    589    472    368    280    [?]    [?]    [?]    [?]    [?] 
     .60        889    779    674    575    484    401    [?]     93     36     12      3 


TABLE III.—MISMEASUREMENT PER THOUSAND 
The Number of Errors of Measurement per Thousand of Unselected 12-year-olds 
Exceeding Two Months, Four Months, Etc. for Varying Reliabilities of Tests, 
Based upon Estimated True Scores 

r, unselected 
12-year-olds      2      4      6      8     10     12     18     24     30     36 

    1.00          0      0 
     .99        250     21      0 
     .98        412    101     14      0 
     .97        503    180     44      7      0 
     .96        562    246    101     20      4      0 
     .95        603    298    119     38      9      2    [?] 
     .94        638    347    159     60     19      5    [?] 
     .93        660    379    187   7[?]     28      8    [?] 
     .92        682    412    219    101     40     14    [?] 
     .91        704    447    254    129     57     25    [?] 
     .90        719    472    290    150     72     31    [?] 
     .85        764    549    368    230    134     72    [?]      0 
     .80        795    603    435    298    194    119    [?]      2      0 
     .70        834    674    529    401    294    208    [?]     12      2      0 
     .60        857    719    589    472    368    280    [?]     31      7      0 

From these tables it may be seen that the error involved in the 
use of tests is considerably less than the error given in Herring’s 
article. Tables II and III are read thus: 

If the reliability coefficient of a test in an unselected 12-year-old 
age group is .92, then 

697 per thousand are mismeasured by more than 2 months 
435 per thousand are mismeasured by more than 4 months 
234 per thousand are mismeasured by more than 6 months 
119 per thousand are mismeasured by more than 8 months 


51 per thousand are mismeasured by more than 10 months 
19 per thousand are mismeasured by more than 12 months 
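
The readings above can be reproduced approximately from the normal curve. The sketch below is an illustration under the assumption that errors of measurement are normally distributed with the standard error given in Table I (5.12 months for an obtained score when r = .92); the function name is hypothetical. The published figures appear to have been taken from printed tables, so agreement is only to within a few per thousand.

```python
from statistics import NormalDist


def mismeasured_per_thousand(standard_error, months):
    """Expected number per thousand whose error exceeds `months` in either direction."""
    z = months / standard_error
    return 1000 * 2 * (1 - NormalDist().cdf(z))


SE_OBTAINED_92 = 5.12  # months; Table I, r = .92, obtained score

limits = (2, 4, 6, 8, 10, 12)
computed = [round(mismeasured_per_thousand(SE_OBTAINED_92, m)) for m in limits]
published = [697, 435, 234, 119, 51, 19]  # the readings listed above

for months, ours, theirs in zip(limits, computed, published):
    print(f"more than {months:2d} months: computed {ours}, published {theirs}")
```

Substituting the standard error of an estimated true score (4.91 months at r = .92) gives the corresponding row of Table III in the same way.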


If, however, we use estimated true scores, we find that the numbers 
mismeasured per thousand of unselected 12-year-olds are 

682 per thousand are mismeasured by more than 2 months 
412 per thousand are mismeasured by more than 4 months 
219 per thousand are mismeasured by more than 6 months 
101 per thousand are mismeasured by more than 8 months 
40 per thousand are mismeasured by more than 10 months 
14 per thousand are mismeasured by more than 12 months 

By the use of good group tests the error of measurement need not 
be very large. For illustration, Otis¹ reports that the higher examination 
of the Otis Self Administering tests of mental ability gives an 
average reliability of .921 in a range of which the probable error of 
a score is 2.62. In an unselected age group, namely, a group the 
standard deviation of which is 12 points IQ, this test would have a 
reliability of .90. If both forms were used, it would not be too much 
to expect the reliability to be approximately .95. In this case the 
error of measurement would be as follows for unselected 12-year-olds: 


617 per thousand would be mismeasured by more than 2 months 
317 per thousand would be mismeasured by more than 4 months 
134 per thousand would be mismeasured by more than 6 months 
45 per thousand would be mismeasured by more than 8 months 
12 per thousand would be mismeasured by more than 10 months 
3 per thousand would be mismeasured by more than 12 months 


If we reduce this to intelligence quotients, we find that 547 per 
thousand are correct to within two points or less; 866 per thousand 
are correct to within four points or less; 976 per thousand are correct to 
within six points or less; 997 per thousand are correct within eight 
points or less. 
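
The reduction to intelligence quotients can be sketched in the same manner. The conversion factor is an assumption inferred from the figures already given (a standard deviation of 18 months corresponding to 12 points of IQ, or one and one-half months per point); it is not stated in so many words in the article, and the names used are hypothetical.

```python
from statistics import NormalDist

# Assumed conversion for 12-year-olds: 18 months of mental age = 12 points of IQ.
MONTHS_PER_IQ_POINT = 18.0 / 12.0

SE_OBTAINED_95 = 3.98  # months; Table I, r = .95, obtained score


def correct_within(points):
    """Expected number per thousand measured correctly to within `points` of IQ."""
    z = points * MONTHS_PER_IQ_POINT / SE_OBTAINED_95
    return 1000 * (2 * NormalDist().cdf(z) - 1)


published = {2: 547, 4: 866, 6: 976, 8: 997}  # figures given in the text
for points, theirs in published.items():
    print(f"within {points} points: computed {correct_within(points):.0f}, "
          f"published {theirs}")
```

The computed values agree with the published ones to within two or three per thousand, which is consistent with the use of printed probability tables.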

This is not to be taken as an argument that a group test is as 
valid as an individual test, but merely to show that a group test may 
possess a very high degree of reliability. 

It may be concluded that, even if two forms of a good group test are used, 
there is considerable mismeasurement, but that for ordinary 
school purposes the group test, provided its validity is established, 
is a reliable working instrument. 





¹ Otis, Arthur S.: “Self Administering Tests of Mental Ability.” Manual of 
Directions, p. 12. 
REPLY TO HUFFAKER’S CRITICISM 


JOHN P. HERRING 
Teachers College, Columbia University 


Huffaker (1925) criticises conclusions which I presented for the 
purpose of influencing the construction, publication, and selection of 
tests toward higher standards of reliability; of avoiding premature 
uncritical complacency, which tends to stop improvement in favor of 
processes offering less resistance; and of emphasizing the reliability 
of tests which are reliable. 

Let the phrase “standard group”¹ refer to unselected twelve-year-old 
children. 

The important point concerns the magnitude of the true standard 
deviation of mental ages in the standard group. The best estimate 
of this constant which I can conveniently make is based largely upon 
Willson’s (1925) recent collection of standard deviations, but includes 
Huffaker’s cases (presumably 83 in number) from Terman. 

N is 5670. The standard deviations are weighted in proportion to 
the number of children examined. The weighted average is 


$$\frac{\sum \left( w \, \sigma \sqrt{r_{11}} \right)}{\sum w} = 26.42 \pm 0.35 \text{ mental months}$$

in which 
$w$ is a weight, 
$\sigma$ is an obtained standard sigma,¹ and 
$r_{11}$ is as of a standard group. 


ESTIMATE OF THE TRUE STANDARD DEVIATION OF MENTAL AGES IN A STANDARD GROUP 

                            Standard      Number of 
                            deviations    separate 
Source               N      in mental     [illegible]   Examinations 
                            months 
Terman (1917)        83     18.0          1             Stanford-Binet 
Willson (1925)       5587   26.81         5             Stanford-Binet, National 
                                                        Intelligence Tests, and others 
Total                5670                 6 

Weighted average 26.42 ± 0.35 mental months. 





¹ I have elsewhere suggested this terminology. If Franzen’s idea for three 
standard groups, each with its standard sigma (say six-year-olds, twelve-year-olds, 
and some college group), should obtain, then the terminology would apply to each. 


Assuming continuation of the present tendency to report such 
data in standard groups, it will be as possible to make relatively precise 
estimates of true standard deviations for a series of ages, as it will be 
worth while to construct, for standard reference, separate tables of 
amounts of mismeasurement for each age. For such tables, $\sqrt{1-r}$ 
should be used, and not, as Huffaker affirms, $\sqrt{r-r^2}$. It is also he who 
suggests a reason for using $\sqrt{1-r^2}$ under the precise circumstances, 
namely, that since the former is constant from age to age, it does, 
on that account, represent the general state of affairs by means of 
a single table for all ages. It makes little quantitative difference to 
the general issue whether $\sqrt{1-r^2}$ or $\sqrt{1-r}$ is used, for as $r$ approaches 
unity, both constants approach zero and also approach each other. 
In the useable range of $r_{11}$ in standard groups, say from .85 to 1.00, 
the differences between them are too small to affect the conclusions, a 
fact suggested analogically by comparison of the latter two columns 
of Huffaker’s Table I with his Table III. The differences in point 
are of course smaller than those of the tables. 
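
For reference, the three expressions under discussion, namely the square roots of (1 − r), (1 − r²), and (r − r²), can be tabulated over the range of r mentioned. The sketch below only evaluates the expressions, each expressed in units of the group's standard deviation. The function names and the choice of r values are illustrative and do not come from either author.

```python
import math


def root_one_minus_r(r):
    """sqrt(1 - r): error of an obtained score, in SD units."""
    return math.sqrt(1.0 - r)


def root_one_minus_r_squared(r):
    """sqrt(1 - r^2): error of estimating one form from the other, in SD units."""
    return math.sqrt(1.0 - r * r)


def root_r_minus_r_squared(r):
    """sqrt(r - r^2): error of an estimated true score, in SD units."""
    return math.sqrt(r - r * r)


print("   r   sqrt(1-r)  sqrt(1-r^2)  sqrt(r-r^2)")
for r in (0.85, 0.90, 0.95, 0.99, 1.00):
    print(f"{r:.2f}   {root_one_minus_r(r):.3f}      "
          f"{root_one_minus_r_squared(r):.3f}        "
          f"{root_r_minus_r_squared(r):.3f}")
```

All three expressions vanish at r = 1. Since 1 − r² = (1 − r)(1 + r), the second is always the first multiplied by the square root of (1 + r), which lets the reader judge the size of the differences at any given r.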

It was not my purpose to provide complete reference tables, but 
to portray our situation in universal terms. The general problem may 
be solved intelligibly by means of either $\sqrt{1-r^2}$ or $\sqrt{1-r}$, as long as 
absolute values are either closely represented or not of special moment. 
In this instance they are both important and approximately true. 

Correction of standard deviations for unreliability of measurement 
is also a sound procedure, but it was judged not worth the time for 
the purpose, since again it would introduce only insignificant differences 
in the generally useable range of $r_{11}$, a fact seen in Huffaker’s 
Table I. 

The net result upon my Table I, due to change in standard deviation 
from 24 to 26.42 ± 0.35 mental months, to use of $\sqrt{1-r}$ 
instead of $\sqrt{1-r^2}$, and to correction of standard deviations for 
unreliability, would be to leave it, within the range of sufficient precision, 
almost exactly as it is. It would in the same degree support the same 
conclusions and the same selective caution. 
NOTES ON ARTICLES IN EDUCATIONAL 
PSYCHOLOGY IN CURRENT ISSUES OF 
OTHER MAGAZINES 


REPORTED BY C. O. MATHEWS 
Graduate Student, Teachers College 
Columbia University 











INTELLIGENCE TESTING 


Intelligence Tests in Hobart College. C. M. Louttit. School and Society, 
Sept. 5, 1925, p. 312. An article giving the results of the use of Army Alpha over 
a four-year period in Hobart and William Smith Colleges. 

The Effect of Incentives upon the Constancy of the IQ. Elizabeth B. Hurlock. 
The Pedagogical Seminary and Journal of Genetic Psychology, Sept., 1925, pp. 
422-434. Praise and reproof both tend to raise the IQ’s obtained on the N.I.T. 
and were almost equal in influence. 

Relation of Family Size to Intelligence of Offspring and Socio-economic Status of 
Family. J. Crosby Chapman and D. M. Wiggins. The Pedagogical Seminary 
and Journal of Genetic Psychology, Sept., 1925, pp. 414-421. In 650 cases size of 
family correlated negatively with IQ and with social status, —.33 and —.27 
respectively. Social status and size of family correlate +.32. 

Technique of Differential Diagnosis. Mervin A. Durea. The Journal of 
Delinquency, July, 1925, pp. 143-153. Thorough case studies must depend upon 
tests of special abilities and an analysis of all available data as well as upon intelli- 
gence tests. Two case studies given. 


ACHIEVEMENT TESTING 


A Test in Health Knowledge. Arthur I. Gates and Ruth Strang. Teachers 
College Record, June, 1925, pp. 867-880. A discussion of the aims, needs and uses 
of the test together with a summary of the method of its construction. 

A Contribution to the Technique of Constructing “Best-answer” Tests. William 
H. Burton. The Elementary School Journal, June, 1925, pp. 762-770. Multiple 
choice tests in civics were given to pupils. Replies to these same questions in 
interviews are analyzed to determine how pupils select answers to such questions. 

The Influence of Standardized Tests on the Curriculum in Arithmetic. Clifford B. 
Upton. The Mathematics Teacher, April, 1925, pp. 194-208. A criticism of 
arithmetic tests which include problems of doubtful value and omit more important 
topics. 

A Method for the Detection of Cheating in College Examinations. Ralph Gundlach. 
School and Society, Aug. 15, 1925, pp. 215-216. An attempt to measure 
the amount of cheating in groups by the use of two forms of a true-false test. 

A Study in Sectioning in Elementary Psychology at Iowa State College. Nira M. 
Klise. School and Society, Aug. 29, 1925. Report of an experiment using intel- 
ligence examinations and periodic achievement tests with a system of shifting 
sections. 
PSYCHOLOGY OF LEARNING AND OF SCHOOL SUBJECTS 


The Influence of the Factor of Intelligence on the Form of the Learning Curve. 
Giles Murrel Ruch. Psychological Monographs, Vol. XXXIV, No. 7, whole 
No. 160, 1925. The form of a learning curve depends upon the type of function 
and the intelligence of the subjects. 

The Optimal Position of a Rest Period in Learning. Vivienne Robinson 
McClatchy. The Journal of Experimental Psychology, Aug., 1925, pp. 251-277. 
An attempt to determine “whether or not the length of the rest period should be 
varied according to the position of interpolation . . . and to determine its optimal 
locus in the process of learning.” 

Attacking the Causes of Reading Deficiency. Laura Zirbes. Teachers College 
Record, June, 1925, pp. 856-866. A “tabular analysis of suggestions on diagnosis 
and remedial work” with special consideration of the causes of deficiency. 

Children’s Interest in Books and Magazines. A. M. Jordan. The Pedagogical 
Seminary and Journal of Genetic Psychology, Sept., 1925, pp. 455-469. A sum- 
mary of investigations carried on in public libraries and by questionnaire. 

Speed and Scholarship vs. Arithmetical Accuracy. W. W. Ludeman. School 
Science and Mathematics, May, 1925, pp. 522-524. With 44 subjects a correla- 
tion of .42 is found between speed and accuracy while a correlation of —.069 is 
found between scholarship and accuracy. 

A Study of the Factors of Success in First Year Algebra. Edwin W. Schreiber. 
The Mathematics Teacher, March, 1925, pp. 141-163. A study of the influence 
of arithmetic abilities, general intelligence, age and time of day upon success 
in algebra. 

TEACHERS’ MARKS 


Adjusting Marking Systems to Differences in Groups. W. F. Tidyman. School 
and Society, Aug. 22, 1925, pp. 247-248. Intelligence tests were used in a teachers 
college to indicate the number of students entitled to each mark. 

A Statistical Analysis of Some School Marks. Frank Sandon. The Forum of 
Education, Feb., 1925, pp. 24-31. That younger children of each class advanced 
more rapidly is shown by a correlation of school marks. 

The Value of Standards in Grading Examination Papers. L. A. Sharp. Pea- 
body Journal of Education, July, 1925, pp. 38-45. Teachers’ judgments of the 
difficulty of arithmetic questions are very unreliable. The variability of marks of 
122 teachers was reduced 80 per cent by the use of a standard. 


CHARACTER AND PERSONALITY 


Research on the Diagnosis of Pre-delinquent Tendencies. Lewis M. Terman. 
The Journal of Delinquency, July, 1925, pp. 124-130. A review of several tests 
of character traits and their significance in showing moral trends. 

A Scale of Promise and Its Application to Seventy-one Nine-year-old Gifted 
Children. Barbara S. Burks. The Pedagogical Seminary and Journal of Genetic 
Psychology, Sept., 1925, pp. 389-413. An article on the construction of a scale 
for use in predicting success and its application to seventy-one of the gifted 
children being studied at Stanford University. 

Variable Factors Encountered in the Rating of Students. Joseph V. Hanna. 
School Science and Mathematics, May, 1925, pp. 481-488. Character ratings 
by teachers of similar subjects correlate noticeably higher than the ratings of these 
same qualities by teachers of dissimilar subjects. 
HOW SMALL CHILDREN LEARN WORDS 


A Study of Learning and Retention in Young Children, by Lois Hayden 
Meek, New York: Teachers College, Columbia University, Con- 
tributions, No. 164, 1925. Pp. IX + 96. 


This careful and detailed investigation is one of the many now being 
carried out that are necessary for the improvement of the technique 
of instruction. The purpose of this research was to study the effect 
of the following factors upon the learning of young children in reading 
words: (1) varying amounts of initial practice, (2) varying amounts of 
later practice, and (3) similarity of associated words. 

Seventy-one children, of ages four, five, and six, in the Horace Mann 
Kindergarten and Nursery School were the subjects of the experiment. 
Six four-letter words were selected to be taught individually to each 
child. Associated with each word were five other four-letter 
words having the same initial letter, final letter, first two letters, final 
two letters, or middle two letters. 

The initial practice was varied in this way: the children learned one 
word until two consecutive correct recognitions could be made at the 
initial learning, one word with five consecutive correct recognitions, 
one word with eight. Dr. Meek found that learning to the point of 
eight initial recognitions insured less forgetting than to the point of 
five, and five recognitions less than two. The time interval between 
practice periods was 1, 2, 4, 9, 14 days. Eight initial recognitions 
required more time and effort, however; caused considerable dissatis- 
faction to the children; and was probably, on the whole, less desirable. 
Less thorough initial learning required more frequent practice periods. 
Analysis of the effects of varying amounts of later practice leads to the 
conclusion that a gradual development with relatively short, frequent 
practice periods is more efficient for young children than concentrated 
extended practice. 
The analysis of cues used by children in learning to recognize words 
is interesting and suggestive of further research. These conclusions 
emerge. The initial letter was used as a cue more often than the final 
letter. The last two letters were used more often than the first two. 
The middle two letters were used least of all as cues. Certain letters or 
groups of letters which have peculiar formations, such as i, g, ll, o, k, 
were frequently selected as cues. 

This study seems to have been very thoughtfully planned, carried 
through with meticulous care, and reported with scientific attitude 
and completeness. 





A METHOD OF PREVENTING SCHOOL DELINQUENTS 


The Visiting Teacher Movement, by Julius J. Oppenheimer. Joint 
Committee on Methods of Preventing Delinquency, 50 East 42nd 
Street, New York City, 1925. Pp. XVII + 206. 


The visiting teacher movement which has been developing quietly 
but steadily during the past two decades promises much toward the 
solution of those problems of the child and the home which arise from 
the present widespread family instability. Serious problems of social 
adaptation are caused by such movements as immigration, the drift 
to the city, the migration of negroes northward; and by conditions 
resulting from a congested, impersonal, mechanical urban life. The 
school must help to prevent juvenile delinquency and to rectify tragic 
maladaptions growing out of broken homes, the weakening of religious 
controls, and the widespread deficiency of home supervision and 
guidance relative to moral restraints and a wise use of leisure time. It 
follows that the school must understand and guide the pupil in his 
home and social life if its educational efforts are to be truly successful. 
The visiting teacher is the agent through which the school removes 
or prevents as far as possible those handicaps of school children 
resulting from their social environment. 

Dr. Oppenheimer’s study represents the first comprehensive 
survey of the field made primarily from the point of view of the educa- 
tor. The object of the investigation is concisely stated by the author 
in the following words: ‘“‘The purpose of this study is to consider the 
place in the school organization for the visiting teacher; to show the 
relationships which exist between the visiting teacher and other school 
agents; to determine a group of functions which should be peculiar to 
this service; to show the relationship of this service to the purposes of 
the school; to evaluate its importance to improvement in scholarship; 
and to outline the qualifications and training necessary for this type 
of educational service.” 

The sources of material for the study were: reports of visiting 
teachers’ work; two surveys of national scope; study and analysis of 
case records; interviews with visiting teachers and school adminis- 
trators; and questionnaires sent to social workers and various groups 
of school people. 

The report is a commendable piece of work. It will prove both 
interesting and enlightening to all educators who wish the school to 
render the greatest possible service to the child and to society. 

J. H. 





A MANUAL OF EXPERIMENTS FOR STUDENTS 


Experiments and Exercises in Educational Psychology, by Harvey A. 
Peterson. Public School Publishing Co., Bloomington, Ill., 
1925. Pp. 200. 


This book of experiments and exercises is designed to meet the 
needs of students who are preparing to become teachers. The experi- 
ments cover a wide range of subject-matter and vary in technique from 
the use of standard tests, concretely defined, to attempts at subjective 
estimates of purposing in the acquisition of motor skills. 

The majority of the experiments are objective and require the 
application of statistical methods and carefully controlled experimental 
procedure. The materials are so organized that the experiments may 
be directed by students. This is the chief advantage of the series, 
although the statistical treatment of data would seem difficult for the 
average student of elementary psychology, unless the course were 
supplemented by a course in statistical methods or measurements. 

The directions are concise and carefully worded and report sheets 
are supplied for each experiment. The book is in loose leaf form so 
that the order of topics may be varied at the discretion of the instructor. 

Although devised as practical illustrations of the various subjects 
considered in a course in Educational Psychology, many of the exercises, 
suggestive in content, seem to be best fitted to develop a comprehen- 
sion of the necessity for standard methods in all scientific procedure. 

Bess V. CUNNINGHAM. 











